Patent abstract:
"Image coding method, image decoding method, image coding device, image decoding device and image coding/decoding device". A dependency indication is included at the beginning of a packet, that is, in the vicinity of the slice header to be parsed or of the parameter sets, and is signaled. This is achieved, for example, by including the dependency indication at the beginning of the slice header, advantageously after a syntax element identifying the parameter set and before the slice address, by including the dependency indication in a NALU header, by using a separate message, or by using a special NALU type for NALUs that carry dependent slices.
Publication number: BR112015004140A2
Application number: R112015004140
Filing date: 2013-09-19
Publication date: 2019-10-29
Inventors: Esenlik Semih; Narroschke Matthias; Wedi Thomas
Applicant: Velos Media International Limited;
Primary IPC class:
Patent description:

Descriptive Report of the Invention Patent for "IMAGE ENCODING METHOD, IMAGE DECODING METHOD, IMAGE ENCODING APPARATUS, IMAGE DECODING APPARATUS, AND IMAGE ENCODING AND DECODING APPARATUS".
TECHNICAL FIELD [0001] The present invention relates to an image encoding method for encoding an image and an image decoding method for decoding an image.
BACKGROUND TECHNIQUE [0002] Most of today's standardized video encoding algorithms are based on hybrid video encoding. In hybrid video encoding methods, several different lossless and lossy compression schemes are used in order to achieve the desired compression gain. Hybrid video encoding is also the basis for ITU-T standards (H.26x standards such as H.261 and H.263) as well as ISO/IEC standards (MPEG-X standards such as MPEG-1, MPEG-2 and MPEG-4).
[0003] The most recent and advanced video encoding standard is currently the H.264/MPEG-4 AVC (Advanced Video Coding) standard. This is a result of standardization efforts by the Joint Video Team (JVT), a joint team of the ITU-T and ISO/IEC MPEG groups.
[0004] Furthermore, a video coding standard called High Efficiency Video Coding (HEVC) is being considered by the Joint Collaborative Team on Video Coding (JCT-VC), with the particular objective of improving coding efficiency for high-resolution video coding.
CITATION LIST
NON-PATENT LITERATURE
[0005] Non-Patent Literature 1: C. Gordon, et al., "Wavefront Parallel Processing for HEVC Encoding and Decoding", JCTVC-F274-v2, from the Meeting in Torino, July 2011, Internet <URL: http://phenix.int-evry.fr>
[0006] Non-Patent Literature 2: A. Fuldseth, et al., "Tiles", JCTVC-F355-v1, from the Meeting in Torino, July 2011, Internet <URL: http://phenix.int-evry.fr>
[0007] Non-Patent Literature 3: JCTVC-J1003_d7, "High efficiency video coding (HEVC) text specification draft 8", July 2012, page 73, "dependent_slice_flag", Internet <URL: http://phenix.it-sudparis.eu/jct/>
SUMMARY OF THE INVENTION
TECHNICAL PROBLEM [0008] However, there is a problem in that an image encoding method, an image decoding method, and the like are not sufficiently efficient.
[0009] Therefore, the present invention provides an image encoding method and an image decoding method that have the ability to increase processing efficiency.
SOLUTION TO PROBLEM [0010] An image encoding method, according to one aspect of the present invention, is an image encoding method for performing encoding processing by partitioning a picture into a plurality of slices, the image encoding method comprising transmitting a bit stream that includes: a dependent slice enable indicator indicating whether or not the picture includes a dependent slice on which the encoding processing is performed depending on a result of the encoding processing on a slice other than a current slice; a slice address that indicates a starting position of the current slice; and a dependency indication that indicates whether or not the current slice is the dependent slice, wherein the dependent slice enable indicator is arranged in a parameter set common to the slices, the slice address is arranged in a slice header of the current slice, and the dependency indication is arranged in the slice header, and is arranged before the slice address and after a syntax element identifying the parameter set.
[0011] An image decoding method, according to an aspect of the present invention, is an image decoding method for performing decoding processing by partitioning a picture into a plurality of slices, the image decoding method comprising extracting, from an encoded bit stream, a dependent slice enable indicator indicating whether or not the picture includes a dependent slice on which the decoding processing is performed depending on a result of the decoding processing on a slice other than a current slice, a slice address indicating a starting position of the current slice, and a dependency indication indicating whether or not the current slice is the dependent slice, wherein the dependent slice enable indicator is arranged in a parameter set common to the slices, the slice address is arranged in a slice header of the current slice, and the dependency indication is arranged in the slice header, and is arranged before the slice address and after a syntax element identifying the parameter set.
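For illustration only, the ordering required by these aspects can be sketched in C as a strongly simplified slice header parser; the helper names, the fixed bit widths and the example bit pattern below are assumptions for this sketch and do not reproduce the exact HEVC syntax.

```c
#include <stdint.h>
#include <stdio.h>

/* Minimal MSB-first bit reader (illustrative only). */
typedef struct { const uint8_t *buf; size_t bitpos; } BitReader;

static uint32_t read_bits(BitReader *br, int n) {
    uint32_t v = 0;
    while (n-- > 0) {
        v = (v << 1) | ((br->buf[br->bitpos >> 3] >> (7 - (br->bitpos & 7))) & 1);
        br->bitpos++;
    }
    return v;
}

/* ue(v): unsigned Exp-Golomb code, as used for parameter set ids. */
static uint32_t read_ue(BitReader *br) {
    int zeros = 0;
    while (read_bits(br, 1) == 0) zeros++;
    return (1u << zeros) - 1 + read_bits(br, zeros);
}

/* Sketch of the claimed ordering: (1) parameter set id, (2) dependency
 * indication, (3) slice address. */
typedef struct {
    uint32_t pic_parameter_set_id;
    uint32_t dependent_slice_flag;
    uint32_t slice_address;
} SliceHeaderStart;

static void parse_slice_header_start(BitReader *br, int dependent_slice_enabled_flag,
                                     int slice_address_bits, SliceHeaderStart *sh) {
    sh->pic_parameter_set_id = read_ue(br);                            /* (1) */
    sh->dependent_slice_flag =
        dependent_slice_enabled_flag ? read_bits(br, 1) : 0;           /* (2) */
    sh->slice_address = read_bits(br, slice_address_bits);             /* (3) */
}

int main(void) {
    const uint8_t data[] = { 0xD4 };  /* ue(0), flag = 1, 4-bit address = 5, padding */
    BitReader br = { data, 0 };
    SliceHeaderStart sh;
    parse_slice_header_start(&br, 1, 4, &sh);
    printf("pps id=%u dependent=%u address=%u\n",
           sh.pic_parameter_set_id, sh.dependent_slice_flag, sh.slice_address);
    return 0;
}
```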
[0012] The general and specific aspects revealed above can be implemented using a system, a method, an integrated circuit, a computer program or a computer-readable recording medium such as a CD-ROM, or any combination of systems, methods, integrated circuits, computer programs or computer-readable recording media.
ADVANTAGEOUS EFFECTS OF THE INVENTION [0013] An image encoding method and an image decoding method, in accordance with the present invention, have the ability to increase the encoding efficiency.
BRIEF DESCRIPTION OF THE DRAWINGS [0014] These and other objectives, advantages and features of the disclosure will become apparent from the following description taken in conjunction with the accompanying drawings that illustrate a specific embodiment of the present invention.
[0015] Figure 1 is a block diagram showing an example of an encoder that is compatible with HEVC.
[0016] Figure 2 is a block diagram showing an example of a decoder that is compatible with HEVC.
[0017] Figure 3 is a diagram showing an example of an image configuration in parallel wavefront processing (WPP).
[0018] Figure 4 is a diagram showing an example of a relationship between a normal slice and a dependent slice in parallel wavefront processing.
[0019] Figure 5 is a diagram showing an example of a packet header.
[0020] Figure 6 is a diagram showing an example of a slice header for an entropy slice or a dependent slice.
[0021] Figure 7 is a diagram showing dependencies and signal transmission when a normal slice is used.
[0022] Figure 8 is a schematic view showing dependencies and signal transmissions when a dependent slice and an entropy slice are used.
[0023] Figure 9A is a diagram showing an example of a syntax implementation of the inter-layer dependencies, temporal dependencies and inter-slice dependencies in HM8.0.
[0024] Figure 9B is a diagram to explain parsing steps to be performed to parse the inter-layer dependencies in HM8.0.
[0025] Figure 9C is a diagram to explain parsing steps to be performed to parse the inter-layer dependencies in HM8.0.
[0026] Figure 10 is a diagram showing an example of the position of the dependent_slice_flag.
[0027] Figure 11 is a diagram showing an example of syntax when the parsing condition relating to dependent_slice_enabled_flag in Figure 10 is removed.
[0028] Figure 12 is a diagram showing an example of syntax when the dependent_slice_flag is moved before first_slice_in_pic_flag.
[0029] Figure 13 is a diagram showing an example of syntax when the dependent_slice_flag is moved before the slice_address syntax element.
[0030] Figure 14 is a diagram showing an example of syntax when the dependent_slice_flag is moved into the NAL header.
[0031] Figure 15 is a diagram showing an example of a slice header syntax for a dependent slice when a new type is added to the NAL unit types used for dependent slices.
[0032] Figure 16 is a diagram showing an example of syntax of a slice header and a NAL unit header when it is assumed that dependent_slice_flag is set to 1 for certain types of NALU.
[0033] Figure 17 shows an overall configuration of a content providing system for implementing content distribution services.
[0034] Figure 18 shows a general configuration of a digital broadcasting system.
[0035] Figure 19 shows a block diagram that illustrates an example of a television configuration.
[0036] Figure 20 shows a block diagram that illustrates an example of a configuration of an information reproduction / recording unit that reads and records information from and on a recording medium that is an optical disc.
[0037] Figure 21 shows an example of a configuration of a recording medium that is an optical disc.
[0038] Figure 22A shows an example of a cell phone.
[0039] Figure 22B is a block diagram showing an example of a cell phone configuration.
[0040] Figure 23 illustrates a multiplexed data structure.
[0041] Figure 24 shows, schematically, how each stream is multiplexed in multiplexed data.
[0042] Figure 25 shows how a video stream is stored in a PES packet stream in more detail.
[0043] Figure 26 shows a structure of TS packets and source packets in the multiplexed data.
[0044] Figure 27 shows a data structure for a PMT.
[0045] Figure 28 shows an internal structure of multiplexed data information.
[0046] Figure 29 shows an internal structure of stream attribute information.
[0047] Figure 30 shows steps to identify video data.
[0048] Figure 31 shows an example of a configuration of an integrated circuit for implementing the moving picture coding method and the moving picture decoding method according to each of the embodiments.
[0049] Figure 32 shows a configuration for switching between driving frequencies.
[0050] Figure 33 shows steps for identifying video data and switching between driving frequencies.
[0051] Figure 34 shows an example of a look-up table in which video data standards are associated with driving frequencies.
[0052] Figure 35A is a diagram showing an example of a configuration for sharing a module of a signal processing unit.
[0053] Figure 35B is a diagram showing another example of a configuration for sharing a module of the signal processing unit.
DESCRIPTION OF EMBODIMENTS
(Underlying Knowledge Forming the Basis of the Present Disclosure) [0054] In relation to the image encoding method and the image decoding method described in the Background section, the inventors found the following problem.
[0055] First, an image encoding device and an HEVC image decoding device will be described.
[0056] A video signal input to an image encoding device is a sequence of images called frames (pictures). Each frame includes a two-dimensional array of pixels. All of the standards mentioned above based on hybrid video encoding include partitioning each individual video frame into smaller blocks including a plurality of pixels. The size of the blocks can vary, for example, according to the content of the image. The encoding method can typically be varied on a per-block basis. The largest possible size for such a block, for example in HEVC, is 64 x 64 pixels. It is called the largest coding unit (LCU). The LCU can be recursively partitioned into four CUs.
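As an illustration of this recursive partitioning, the following small C sketch splits a 64 x 64 LCU into CUs using a quadtree; the split decision is a placeholder, whereas a real encoder would decide it, for example, by rate-distortion optimization.

```c
#include <stdio.h>

/* Placeholder split decision: split everything larger than 16x16. */
static int should_split(int x, int y, int size) {
    (void)x; (void)y;
    return size > 16;
}

/* Recursive quadtree partitioning of a square unit into coding units. */
static void partition_cu(int x, int y, int size, int min_size) {
    if (size > min_size && should_split(x, y, size)) {
        int half = size / 2;
        partition_cu(x,        y,        half, min_size);
        partition_cu(x + half, y,        half, min_size);
        partition_cu(x,        y + half, half, min_size);
        partition_cu(x + half, y + half, half, min_size);
    } else {
        printf("CU at (%2d,%2d), size %dx%d\n", x, y, size, size);
    }
}

int main(void) {
    partition_cu(0, 0, 64, 8);   /* one 64x64 LCU, minimum CU size 8x8 */
    return 0;
}
```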
[0057] In H.264 / MPEG-4 AVC, a macroblock (which normally denotes a 16 x 16 pixel block) was the basic image element, for which encoding is performed. The macroblock can be further divided into smaller sub-blocks. The encoding steps included in the encoding method and / or the decoding steps included in the decoding method are performed on a per-block basis.
1-1. HYBRID VIDEO ENCODING [0058] The following briefly describes hybrid video encoding.
[0059] Typically, the encoding steps of hybrid video encoding include a spatial and/or temporal prediction. Thus, each block to be encoded is first predicted using either blocks in its spatial neighborhood or blocks in its temporal neighborhood, that is, from previously encoded video frames. A residual block that is the difference between the block to be encoded and its prediction is then calculated. Then, the residual block is transformed from the spatial (pixel) domain into a frequency domain. The transformation aims to reduce the correlation of the input block.
[0060] In addition, the transform coefficients obtained from the transformation are quantized. This quantization is lossy (irreversible) compression. Normally, the compressed transform coefficient values are further compressed losslessly by entropy coding. In addition, auxiliary information necessary for the reconstruction of the encoded video signal is encoded and supplied together with the encoded video signal. This is, for example, information about spatial prediction, temporal prediction and/or quantization.
1-2. IMAGE ENCODING DEVICE CONFIGURATION [0061] Figure 1 shows an example of a typical H.264/MPEG-4 AVC and/or HEVC image encoding device (encoder 100).
[0062] As shown in Figure 1, the encoder 100 includes a subtractor 105, a transformation unit 110, a quantization unit 120, an inverse transformation unit 130, an adder 140, a deblocking filter 150, an adaptive loop filter 160, a frame memory 170, a prediction unit 180 and an entropy encoder 190.
[0063] The prediction unit 180 derives a prediction signal s2 by temporal prediction or spatial prediction. The type of prediction used in the prediction unit 180 can be varied on a per-frame basis or on a per-block basis. Temporal prediction is called inter prediction, and spatial prediction is called intra prediction. Coding using a prediction signal s2 obtained by temporal prediction is called inter coding, and coding using a prediction signal s2 obtained by spatial prediction is called intra coding. In deriving a prediction signal using temporal prediction, encoded images stored in a memory are used. In deriving a prediction signal using spatial prediction, a boundary pixel value of a neighboring encoded or decoded block stored in a memory is used. The number of prediction directions in intra prediction depends on the size of the coding unit (CU). It should be noted that the prediction details will be described later.
[0064] The subtractor 105 first determines a difference (prediction error signal e) between a current block to be encoded of an input image (input signal s1) and a corresponding prediction block (prediction signal s2). The difference is used for the prediction of the current block to be encoded. It should be noted that the prediction error signal is also called a residual prediction signal.
[0065] The transformation unit 110 transforms the prediction error signal e into coefficients. In general, the transformation unit 110 uses an orthogonal transformation such as a two-dimensional discrete cosine transform (DCT) or an integer version thereof. The orthogonal transformation can efficiently reduce the correlation of the input signal s1 (the video signal before encoding). After the transformation, lower frequency components are usually more important for image quality than high frequency components, so more bits can be spent on encoding the low frequency components than the high frequency components.
[0066] The quantization unit 120 quantizes coefficients and derives quantized coefficients.
[0067] The entropy encoder 190 performs entropy coding on the quantized coefficients. Quantized coefficients are compressed without loss by entropy coding. In addition, by entropy coding, the volume of data stored in memory and the volume of data (bit stream) to be transmitted can be further reduced. Entropy coding is performed by essentially coding using a variable length codeword. The length of a codeword is chosen based on the probability of its occurrence.
[0068] The entropy encoder 190 converts the two-dimensional matrix of quantized coefficients into a one-dimensional array. Typically, the entropy encoder 190 performs this conversion through a so-called zigzag scan. The zigzag scan starts with the DC coefficient in the upper left corner of the two-dimensional array and scans the two-dimensional array in a predetermined sequence ending with an AC coefficient in the lower right corner. The energy is typically concentrated in the upper left part of the two-dimensional coefficient matrix. In general, coefficients located toward the upper left corner are low frequency component coefficients, and coefficients located toward the lower right corner are high frequency component coefficients. Therefore, the zigzag scan results in an array in which the last values are usually a consecutive run of zeros. This allows efficient coding using run-length codes as a part of, or before, the actual entropy coding.
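The zigzag conversion can be illustrated with the following small C sketch for a 4 x 4 block of quantized coefficients; the block content is made up for the example, and actual HEVC scans and block sizes differ in detail.

```c
#include <stdio.h>

#define N 4  /* block size used for this small illustration */

/* Convert a two-dimensional block into a one-dimensional array by a zigzag
 * scan: start at the DC coefficient in the top-left corner and end at the
 * highest-frequency AC coefficient in the bottom-right corner. */
static void zigzag_scan(const int block[N][N], int out[N * N]) {
    int row = 0, col = 0;
    for (int idx = 0; idx < N * N; idx++) {
        out[idx] = block[row][col];
        if ((row + col) % 2 == 0) {              /* moving up-right  */
            if (col == N - 1)      row++;
            else if (row == 0)     col++;
            else                 { row--; col++; }
        } else {                                 /* moving down-left */
            if (row == N - 1)      col++;
            else if (col == 0)     row++;
            else                 { row++; col--; }
        }
    }
}

int main(void) {
    /* Typical post-quantization block: energy concentrated in the top-left. */
    const int block[N][N] = {
        { 9, 4, 1, 0 },
        { 3, 2, 0, 0 },
        { 1, 0, 0, 0 },
        { 0, 0, 0, 0 },
    };
    int scanned[N * N];
    zigzag_scan(block, scanned);
    for (int i = 0; i < N * N; i++) printf("%d ", scanned[i]);
    printf("\n");  /* trailing values form a run of zeros, suited to run-length coding */
    return 0;
}
```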
[0069] H.264/MPEG-4 AVC and HEVC use different types of entropy coding. Although some syntax elements are coded with fixed-length codes, most syntax elements are coded with variable-length codes. In particular, context-adaptive binary arithmetic coding (CABAC) is used to encode the prediction error signals (residual prediction signals). In general, various other integer codes different from CABAC are used for encoding the other syntax elements. However, context-adaptive binary arithmetic coding can also be used.
[0070] Variable length codes allow lossless compression of the encoded bit stream. However, since the codewords are variable in length, decoding must be performed sequentially over the codewords. In other words, it is not possible to encode or decode a codeword before encoding or decoding the previous codewords without restarting (initializing) the entropy coding or without separately indicating a position of the codeword (starting point) at which to start decoding.
[0071] Arithmetic coding encodes a sequence of bits into a single codeword based on a predetermined probability model. The predetermined probability model is determined according to the content of the video sequence in the case of CABAC. Arithmetic coding, and thus also CABAC, is more efficient when the length of the bit stream to be encoded is greater. In other words, CABAC applied to bit strings is more efficient for larger blocks. At the beginning of each sequence, CABAC is restarted. In other words, at the beginning of each video sequence, its probability model is initialized with some predefined or predetermined values.
[0072] The entropy encoder 190 transmits, to a decoder side, a bit stream which includes the encoded quantized coefficients (encoded video signals) and the encoded auxiliary information.
[0073] H.264/MPEG-4 AVC as well as HEVC include two functional layers, a Video Coding Layer (VCL) and a Network Abstraction Layer (NAL). The VCL provides the encoding functionality as described above. The NAL encapsulates information elements in standardized units called NAL units according to their further application such as transmission over a channel or storage in a storage device. The information elements encapsulated by the NAL are, for example, (1) the encoded prediction error signal (compressed video data) or (2) other information necessary for decoding the video signal such as the prediction type, quantization parameters, motion vectors, etc. There are VCL NAL units containing the compressed video data and the related information, as well as non-VCL units that encapsulate additional data such as a parameter set related to an entire video sequence, or Supplemental Enhancement Information (SEI) that provides additional information that can be used to improve decoding performance.
[0074] Some non-VCL NAL units include, for example, parameter sets. A parameter set is a set of parameters related to the encoding and decoding of a certain portion of the video sequence. For example, there is a sequence parameter set (SPS) that includes parameters relevant to the encoding and decoding of the entire sequence of pictures. In particular, the sequence parameter set is a syntax structure including syntax elements. The syntax elements are applied to zero or more entire coded video sequences as determined by the content of a seq_parameter_set_id. The seq_parameter_set_id is a syntax element included in the picture parameter set (described below) referred to by pic_parameter_set_id, and pic_parameter_set_id is a syntax element included in each slice header.
[0075] The picture parameter set (PPS) is a parameter set that defines parameters applied to the encoding and decoding of a picture of a picture sequence (video sequence). In particular, the PPS is a syntax structure that includes syntax elements. The syntax elements are applied to zero or more entire coded pictures as determined by pic_parameter_set_id, which is a syntax element found in each slice header.
[0076] Thus, it is simpler to keep track of an SPS than of the PPS. This is because the PPS changes for each picture, while the SPS is constant for the entire video sequence, which can be even minutes or hours in length.
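The referencing chain between the slice header, the PPS and the SPS can be sketched in C as follows; the structures are heavily simplified and only keep the identifiers needed to follow the chain, so the field set is an assumption for illustration.

```c
#include <stdio.h>

#define MAX_SPS 16
#define MAX_PPS 64

typedef struct { int seq_parameter_set_id; int pic_width_in_luma_samples; } Sps;
typedef struct { int pic_parameter_set_id; int seq_parameter_set_id; } Pps;
typedef struct { int pic_parameter_set_id; /* ... remaining slice header fields ... */ } SliceHeader;

static Sps sps_table[MAX_SPS];
static Pps pps_table[MAX_PPS];

/* Resolve the SPS that applies to a given slice:
 * slice header -> pic_parameter_set_id -> PPS -> seq_parameter_set_id -> SPS. */
static const Sps *active_sps_for_slice(const SliceHeader *sh) {
    const Pps *pps = &pps_table[sh->pic_parameter_set_id];
    return &sps_table[pps->seq_parameter_set_id];
}

int main(void) {
    sps_table[0] = (Sps){ .seq_parameter_set_id = 0, .pic_width_in_luma_samples = 1920 };
    pps_table[3] = (Pps){ .pic_parameter_set_id = 3, .seq_parameter_set_id = 0 };
    SliceHeader sh = { .pic_parameter_set_id = 3 };
    printf("picture width from active SPS: %d\n",
           active_sps_for_slice(&sh)->pic_width_in_luma_samples);
    return 0;
}
```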
[0077] Encoder 100 includes a reconstruction unit (so-called decoding unit) that derives a reconstructed signal (so-called a decoded signal) s3. By the reconstruction unit, a reconstructed image obtained by reconstructing (decoding) the encoded image is generated and is stored in frame memory 170.
[0078] The reconstruction unit includes the inverse transformation unit 130, the adder 140, the deblocking filter 150, and the adaptive loop filter 160.
[0079] The inverse transformation unit 130, in accordance with the encoding steps described above, performs inverse quantization and inverse transformation. It should be noted that the prediction error signal e' derived by the inverse transformation unit 130 is different from the prediction error signal e due to the quantization error, also called quantization noise.
[0080] The adder 140 derives a reconstructed signal s' by adding the prediction error signal e' reconstructed by the inverse transformation unit 130 to the prediction signal s2.
[0081] The deblocking filter 150 performs deblocking filter processing to reduce the quantization noise that is superimposed on the reconstructed signal s' due to the quantization. Here, since the encoding steps described above are performed on a per-block basis, there is a case where a block boundary becomes visible when noise is superimposed (blocking characteristics of the noise). The superimposed noise is called blocking noise. In particular, when strong quantization is performed by the quantization unit 120, there are more visible block boundaries in the reconstructed image (decoded image). Such blocking noise has a negative effect on human visual perception, which means that a person feels that the image quality is deteriorated. In order to reduce the blocking noise, the deblocking filter 150 performs deblocking filter processing on each reconstructed signal s' (reconstructed block).
[0082] For example, in the H.264/MPEG-4 AVC deblocking filter processing, for each area, a filter processing suitable for the area is selected. In the case of a high degree of blocking noise, a strong (narrow-band) low-pass filter is applied, while for a low degree of blocking noise, a weaker (broad-band) low-pass filter is applied. The strength of the low-pass filter is determined by the prediction signal s2 and the prediction error signal e'. Deblocking filter processing generally smooths the block edges. This results in an improved subjective image quality of the decoded signals. The filtered image is used for the motion-compensated prediction of the next image. Since the filter processing also reduces the prediction errors, coding efficiency can be improved.
[0083] The adaptive loop filter 160 applies sample adaptive offset (SAO) processing and/or adaptive loop filter (ALF) processing to the reconstructed image after the deblocking filter processing in the deblocking filter 150, to derive a reconstructed signal (decoded signal) s3.
[0084] The deblocking filter processing in the deblocking filter 150 aims to improve the subjective quality. Meanwhile, the ALF processing and the SAO processing in the adaptive loop filter 160 aim to improve the pixel-wise fidelity (objective quality). SAO processing is used to add an offset value to a pixel value for each pixel with the use of a pixel value of the immediately neighboring pixel. The ALF processing is used to compensate for image distortion caused by the compression. Typically, the filter used in ALF processing is a Wiener filter with filter coefficients determined so that the mean square error (MSE) between the reconstructed signal s' and the input signal s1 is minimized. The filter coefficients of the ALF processing are calculated and transmitted on a frame-by-frame basis, for example. ALF processing can be applied to the entire frame (image) or to local areas (blocks). Auxiliary information indicating which areas are to be filtered can be transmitted on a block-by-block basis, a frame-by-frame basis, or a quadtree-by-quadtree basis.
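As a rough illustration of the SAO principle, the following C sketch applies an edge-offset-like correction to one row of reconstructed pixels; the classification rule and the offset values are simplified assumptions and do not reproduce the exact HEVC SAO process.

```c
#include <stdio.h>

static int sign3(int x) { return (x > 0) - (x < 0); }

/* Classify each inner pixel by comparing it with its two horizontal
 * neighbours and add a category-dependent offset. */
static void sao_edge_offset_row(const int *in, int *out, int n, const int offset[5]) {
    out[0] = in[0];
    out[n - 1] = in[n - 1];
    for (int i = 1; i + 1 < n; i++) {
        /* 0 = local minimum, 4 = local maximum, 2 = flat/monotone (no offset) */
        int edge_idx = 2 + sign3(in[i] - in[i - 1]) + sign3(in[i] - in[i + 1]);
        out[i] = in[i] + offset[edge_idx];
    }
}

int main(void) {
    const int rec[6] = { 50, 52, 47, 53, 53, 60 };  /* reconstructed samples      */
    const int offset[5] = { +2, +1, 0, -1, -2 };    /* hypothetical offset values */
    int filtered[6];
    sao_edge_offset_row(rec, filtered, 6, offset);
    for (int i = 0; i < 6; i++) printf("%d ", filtered[i]);
    printf("\n");
    return 0;
}
```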
[0085] The frame memory (frame buffer) 170 stores part of the encoded and then reconstructed (decoded) image (reconstructed signal s3). The reconstructed and stored image is used to decode an inter-coded block.
[0086] The prediction unit 180 derives a prediction signal s2 using the (same) signal that can be used both on the encoder side and on the decoder side, in order to maintain compatibility between the encoder side and the decoder side. The signal that can be used on both the encoder and decoder sides is a reconstructed signal s3 (the video signal after filter processing by the adaptive loop filter 160) on the encoder side, which is encoded and then reconstructed (decoded), and a reconstructed signal s4 (the video signal after filter processing by the adaptive loop filter in Figure 2) on the decoder side, which is decoded from a bit stream.
[0087] The prediction unit 180, when generating a prediction signal s2 by inter coding, uses motion-compensated prediction. A motion estimator of the prediction unit 180 (not shown) finds a best-matching block for the current block from among the blocks within the previously encoded and reconstructed video frames. The best-matching block then becomes a prediction signal. The relative displacement (motion) between the current block and its best-matching block is then signaled as motion data included in the auxiliary information in the form of three-dimensional motion vectors. The signal is transmitted together with the encoded video data. The three-dimensional motion vector includes two spatial dimensions and one temporal dimension. In order to optimize the prediction accuracy, the motion vectors can be determined with a spatial sub-pixel resolution, for example, half-pixel or quarter-pixel resolution. A motion vector with spatial sub-pixel resolution can point to a spatial position within an already reconstructed frame at which no real pixel value is available, that is, a sub-pixel position. Therefore, spatial interpolation of such pixel values is necessary in order to perform motion-compensated prediction. This can be achieved by an interpolation filter (integrated within the prediction unit 180 in Figure 1).
1-3. IMAGE DECODING DEVICE CONFIGURATION [0088] A configuration of a decoder (image decoding device) will be described with reference to Figure 2.
[0089] Figure 2 is a block diagram showing an example of a decoder 200 according to the H.264 / MPEG-4 AVC or HEVC video encoding standard.
[0090] As shown in Figure 2, the decoder 200 includes an entropy decoder 290, an inverse transformation unit 230, an adder 240, a deblocking filter 250, an adaptive loop filter 260, a frame memory 270 and a prediction unit 280.
[0091] A bit stream input to the decoder 200 (encoded video signal) is first transmitted to the entropy decoder 290.
[0092] The entropy decoder 290 extracts the encoded quantized coefficients and the encoded auxiliary information from the bit stream, and decodes the encoded quantized coefficients and the encoded auxiliary information. The auxiliary information is, as described above, information needed for decoding such as motion data (motion vectors) and the prediction mode (prediction type).
[0093] The entropy decoder 290 transforms the decoded quantized coefficients from a one-dimensional array into a two-dimensional array by inverse scanning. The entropy decoder 290 inputs, to the inverse transformation unit 230, the quantized coefficients after they have been transformed into the two-dimensional arrangement.
[0094] The inverse transformation unit 230 performs inverse quantization and inverse transformation on the quantized coefficients transformed into the two-dimensional arrangement, to derive a prediction error signal e'. The prediction error signal e' corresponds to the differences obtained by subtracting the prediction signal from the signal input to the encoder in the case where no quantization noise is introduced and no error has occurred.
[0095] The prediction unit 280 derives a prediction signal s2 by temporal prediction or spatial prediction. Information such as the prediction type included in the auxiliary information is used in the case of intra prediction (spatial prediction). In addition, information such as the motion data included in the auxiliary information is used in the case of motion-compensated prediction (inter prediction, temporal prediction).
[0096] The adder 240 adds the prediction error signal e' obtained from the inverse transformation unit 230 and the prediction signal s2 obtained from the prediction unit 280, to derive a reconstructed signal s'.
[0097] The deblocking filter 250 performs deblocking filter processing on the reconstructed signal s'. The adaptive loop filter 260 applies SAO processing and ALF processing to the reconstructed signal s' to which the deblocking filter processing has been applied by the deblocking filter 250. A decoded signal s4 obtained from the application of SAO processing and ALF processing in the adaptive loop filter 260 is stored in the frame memory 270. The decoded signal s4 stored in the frame memory 270 is used, in the prediction unit 280, to predict the next block or the next image to be decoded.
1-4. PROCESSING EFFICIENCY [0098] In general, parallel processing is considered in order to improve the processing efficiency of encoding processing and decoding processing.
[0099] Compared to H.264/MPEG-4 AVC, HEVC has functions to support high-level parallel processing of encoding and decoding. In HEVC, a frame can be partitioned into slices, similarly to H.264/MPEG-4 AVC. Here, slices are groups of LCUs in the scan order. In H.264/MPEG-4 AVC, slices are independently decodable, and no spatial prediction is applied between the slices. Therefore, parallel processing can be performed on a slice-by-slice basis.
[00100] However, since the slices have significantly large headers and there is a lack of dependencies between the slices, the compression efficiency is reduced. Furthermore, CABAC coding loses efficiency when applied to small blocks of data.
[00101] In order to enable more efficient parallel processing, wavefront parallel processing (WPP) has been proposed. Unlike slice-based parallel processing in which each slice is independent, WPP maintains certain dependencies.
[00102] The following description refers to the case in which a picture comprises LCUs arranged in a matrix and each LCU row forms a slice (refer to Figure 3). In WPP, as the CABAC probability model for resetting the CABAC state of the first LCU (leftmost LCU) among the LCUs of the current LCU row 32, the CABAC probability model obtained just after the processing of the second LCU of the preceding LCU row has been completed is used. All inter-block dependencies are maintained. This allows parallelization of the decoding of the LCU rows. The timing for starting the processing of each LCU row is delayed by two LCUs relative to the previous row. Information on the start points at which the decoding of the LCU rows starts is included in the slice header. WPP is described in detail in Non-Patent Literature 1.
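The two-LCU delay between consecutive LCU rows can be illustrated with the following C sketch, which computes the earliest processing step of each LCU assuming one LCU per step and the dependencies described above (the left neighbour in the same row and the above-right LCU of the preceding row).

```c
#include <stdio.h>

#define ROWS 4
#define COLS 8

int main(void) {
    int step[ROWS][COLS];
    for (int r = 0; r < ROWS; r++) {
        for (int c = 0; c < COLS; c++) {
            int left = (c > 0) ? step[r][c - 1] : -1;
            int above_right = -1;
            if (r > 0)
                above_right = (c + 1 < COLS) ? step[r - 1][c + 1]
                                             : step[r - 1][COLS - 1];
            int ready = (left > above_right) ? left : above_right;
            step[r][c] = ready + 1;          /* one step per LCU */
        }
    }
    for (int r = 0; r < ROWS; r++) {
        for (int c = 0; c < COLS; c++)
            printf("%2d ", step[r][c]);
        printf("\n");                        /* each row starts two steps after the previous one */
    }
    return 0;
}
```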
[00103] Another approach to improve parallelization is called tiles. Here, a frame (picture) is partitioned into tiles. Tiles are rectangular groups of LCUs. The boundaries between the tiles are defined so that the entire picture is partitioned into a matrix. Tiles are processed in raster scan order.
[00104] All dependencies are broken at the tile boundaries. Entropy coding such as CABAC is also reset at the beginning of each tile. Only the deblocking filter processing and the sample adaptive offset processing can be applied across tile boundaries. In this way, tiles can be encoded and decoded in parallel. Tiles are described in detail in Non-Patent Literature 2 and Non-Patent Literature 3.
[00105] Furthermore, in order to improve the concept of slices and make them suitable for parallelization rather than for error resilience, which was the original purpose of slices in H.264/MPEG-4 AVC, the concepts of dependent slices and entropy slices have been proposed.
[00106] In other words, in HEVC, there are three types of slices supported: (1) normal slices; (2) entropy slices; and (3) dependent slices.
[00107] Normal slices denote the slices already known from H.264/MPEG-4 AVC. No spatial prediction is allowed between normal slices. In other words, prediction across slice boundaries is not allowed. This means that a normal slice is encoded without referring to any other slice. In order to enable independent decoding of such slices, CABAC is restarted at the beginning of each slice.
[00108] When the slice to be processed is a normal slice, the CABAC restart includes termination processing of the arithmetic coding processing or arithmetic decoding processing at the end of the preceding slice, and processing to initialize the context table (probability table) to a default value at the beginning of the normal slice.
[00109] Normal slices are used at the beginning of each frame. In other words, every frame has to start with a normal slice. A normal slice has a header that includes parameters needed to decode the slice data.
[00110] The term entropy slice denotes a slice for which spatial prediction is allowed between the parent slice and the entropy slice. However, the parsing of the parent slice and of the entropy slice is performed independently.
[00111] However, the parent slice must be, for example, a normal slice immediately preceding the entropy slice. The parent slice is required for the reconstruction of the pixel values of the entropy slice. In order to enable independent parsing of entropy slices, CABAC is also restarted at the beginning of the slice. As the slice header of an entropy slice, a slice header that is smaller than the slice header of a normal slice can be used. The slice header of an entropy slice includes a subset of the coding parameters with respect to the information transmitted within the header of a normal slice. The elements missing from the entropy slice header are copied from the parent slice header.
[00112] When the slice to be processed is an entropy slice, the CABAC restart, similarly to the case of the normal slice, includes termination processing at the end of the preceding slice, and processing to initialize the context table (probability table) to a default value at the beginning of the current slice.
[00113] (3) The dependent slice is similar to an entropy slice, but it is partially different in the processing in which CABAC is restarted.
[00114] When the slice to be processed is a dependent slice and WPP is not enabled, the CABAC restart includes termination processing at the end of the preceding slice, and processing to initialize the context table to the state value at the end of the preceding slice. When the slice to be processed is a dependent slice and WPP is enabled, the CABAC restart includes termination processing at the end of the preceding slice, and processing to initialize the context table, at the beginning of the current slice, to the state value obtained after the processing of the LCU that belongs to the preceding slice and is the second from the left end.
[00115] As described above, restarting CABAC always includes termination processing. Conversely, at the restart of CABAC, the status of CABAC is normally continued.
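The cases above can be summarized by the following C sketch, which only returns where the CABAC context tables are taken from at the start of a slice; the termination processing, common to all cases, is not modeled.

```c
#include <stdio.h>

typedef enum { NORMAL_SLICE, ENTROPY_SLICE, DEPENDENT_SLICE } SliceType;

typedef enum {
    CTX_DEFAULT_VALUES,          /* initialize the context tables to default values     */
    CTX_END_OF_PRECEDING_SLICE,  /* continue the state from the end of preceding slice  */
    CTX_AFTER_2ND_LCU_UPPER_ROW  /* WPP: state after the 2nd LCU of the preceding row   */
} ContextInitSource;

static ContextInitSource cabac_init_source(SliceType type, int wpp_enabled) {
    if (type == DEPENDENT_SLICE)
        return wpp_enabled ? CTX_AFTER_2ND_LCU_UPPER_ROW : CTX_END_OF_PRECEDING_SLICE;
    return CTX_DEFAULT_VALUES;   /* normal and entropy slices restart from defaults     */
}

int main(void) {
    printf("dependent slice, WPP on  -> %d\n", cabac_init_source(DEPENDENT_SLICE, 1));
    printf("dependent slice, WPP off -> %d\n", cabac_init_source(DEPENDENT_SLICE, 0));
    printf("normal slice             -> %d\n", cabac_init_source(NORMAL_SLICE, 0));
    return 0;
}
```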
[00116] Dependent slices cannot be parsed without a parent slice. Therefore, dependent slices cannot be decoded when the parent slice has not been received. The parent slice is usually the slice preceding the dependent slice in the coding order that includes a complete slice header. The same applies to the parent slice of an entropy slice.
[00117] As described above, entropy slices and dependent slices use the slice header (in particular, the slice header information that is missing from the dependent slice header) of the slice immediately preceding them according to the slice coding order. This rule is applied recursively. The parent slice on which the current dependent slice depends is recognized as available for reference. The reference includes the use of spatial prediction between slices, the sharing of CABAC states, and the like. A dependent slice uses the CABAC context tables that are generated at the end of the immediately preceding slice. In this way, a dependent slice does not initialize the CABAC tables to default values, but instead continues to use the already developed context tables. Additional details regarding the entropy and dependent slices can be found in Non-Patent Literature 3.
[00118] HEVC provides several profiles. A profile includes some configurations of the image encoding device and the image decoding device suitable for a particular application. For example, the main profile only includes normal and dependent slices, but not entropy slices.
[00119] As described above, the encoded slices are further encapsulated in NAL units, which are further encapsulated, for example, in the Real-time Transport Protocol (RTP) and finally in Internet Protocol (IP) packets. This, or other protocol stacks, enables the transmission of the encoded video over packet-oriented networks, such as the Internet or some proprietary networks.
[00120] Networks typically include one or more routers, which employ special hardware that operates very fast. The function of a router is to receive IP packets, analyze their IP packet headers and, accordingly, forward the IP packets to their respective destinations. Since routers need to handle traffic from many sources, the packet handling logic needs to be as simple as possible. The minimum requirement for the router is to check the destination address field in the IP header to determine which route to take to forward the packets. In order to provide additional support for quality of service (QoS), intelligent (media-aware) routers additionally check specialized fields in the network protocol headers, such as the IP header, the RTP header, and even the NALU header.
[00121] As can be seen from the above description of video coding, the different types of slices defined for the purpose of parallel processing, such as dependent slices and entropy slices, are of different importance with respect to the quality degradation caused by their loss. In particular, dependent slices cannot be parsed and decoded without a parent slice. This is due to the fact that, at the beginning of the dependent slice, the entropy encoder or decoder cannot be restarted. Thus, the parent slice is more important for the reconstruction of the image or video.
[00122] In HEVC, the dependent slices and entropy slices introduce an additional dimension of dependency, namely, the inter-slice dependency (a dependency within the frame). This type of dependency is not considered by routers.
[00123] The dependencies described above and, in particular, the inter-slice dependency are not considered at the network level. However, it would be desirable to take the dependency described above into account at the network level in order to provide better support for quality of service. Thus, it is necessary to improve the flexibility of packet handling at the network level by considering the slice dependencies.
(PROBLEM DETAILS)
1-5. WPP AND DEPENDENT SLICES [00124] Dependent slices can be used together with parallel processing tools such as wavefront parallel processing (WPP) and tiles. In particular, dependent slices make it possible for the wavefront (substream) to decrease the transmission delay without causing a coding loss.
[00125] Furthermore, dependent slices serve as start point markers for CABAC substreams, since CABAC is not restarted at the dependent slices. In addition, information indicating the start points can be transmitted in the bit stream in order to provide the start points for possibly independent parsing. In particular, if more than two CABAC substreams are encapsulated in a normal or dependent slice, the start points are explicitly signaled in the form of the number of bytes per substream. Here, a substream denotes a portion of the stream that is independently parsable thanks to the start points. In addition, dependent slices can be used as start point markers, since each dependent slice must have a NAL unit header. This means that the start points can be signaled in relation to such markers.
[00126] The two approaches, namely, explicit start point signaling and marking of the start points by means of dependent slices, are used together.
[00127] As a rule, the starting point of each NAL unit (beginning of each NAL header) must be identifiable. There is no requirement on the exact identification operation. For example, the following two methods can be applied.
[00128] The first method is to place a start code (for example, 3 bytes in length) at the beginning of each NAL header. The second method is to put each NAL unit in a separate packet. Due to the slice dependency, the size of the slice header may be reduced.
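The first method can be illustrated by the following C sketch, which scans a byte stream for the 3-byte start code 0x00 0x00 0x01 in order to locate the NAL unit headers; the example stream content is made up.

```c
#include <stdio.h>
#include <stddef.h>

/* Return the position of the next 3-byte start code, or len if none is found. */
static size_t find_next_start_code(const unsigned char *buf, size_t len, size_t from) {
    for (size_t i = from; i + 3 <= len; i++)
        if (buf[i] == 0x00 && buf[i + 1] == 0x00 && buf[i + 2] == 0x01)
            return i;
    return len;
}

int main(void) {
    /* Two tiny fake NAL units preceded by start codes (payload bytes are arbitrary). */
    const unsigned char stream[] = { 0x00,0x00,0x01, 0x40,0x01, 0xAA,
                                     0x00,0x00,0x01, 0x42,0x01, 0xBB,0xCC };
    size_t pos = 0, len = sizeof stream;
    while ((pos = find_next_start_code(stream, len, pos)) < len) {
        printf("NAL unit header starts at byte offset %zu\n", pos + 3);
        pos += 3;
    }
    return 0;
}
```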
[00129] In relation to entropy slices, the method allows parallel CABAC parsing. This is due to the fact that CABAC is truly restarted at the beginning of entropy slices. In the case of parallel CABAC processing, CABAC represents a bottleneck that can be overcome by parallel CABAC parsing followed by sequential pixel reconstruction operations. In particular, the WPP parallelization tool makes it possible to decode each LCU row by one processing core (intellectual property core (IP core), a functional block). It should be noted that the assignment of the LCU rows to the cores can be different. For example, two rows can be assigned to one core, and one row can be assigned to two cores.
[00130] Figure 3 is a diagram showing an example of a configuration of a picture 300. In Figure 3, the picture 300 is subdivided into rows 31 to 3m (m is the ordinal number of the LCU row) of largest coding units (LCUs). Each of the LCU rows 3i (i = 1 to m) comprises LCUs 3i1 to 3in (n is the ordinal number of the LCU column) arranged in one row. The LCU row 3i corresponds to "Wavefront i". Parallel processing can be performed for the wavefronts. The "CABAC state" arrow in Figure 3 denotes the relation between the LCU that refers to the CABAC state and the reference destination.
[00131] Specifically, in Figure 3, first, among the LCUs included in the LCU row 31, processing (encoding or decoding) starts from the head LCU 311. The processing of the LCUs is performed in order from LCU 311 to LCU 31n. After the processing of the first two LCUs 311 and 312 in the LCU row 31 has been performed, the processing of the LCU row 32 is started. In the processing of the first LCU 321 of the LCU row 32, as shown by the "CABAC state" arrow in Figure 3, the CABAC state immediately after the processing of the LCU 312, the second LCU in the LCU row 31, is used as the initial CABAC state. In other words, there is a delay of two LCUs between two parallel processes.
[00132] Figure 4 is a diagram showing an example of the case where dependent slices are used with WPP. The LCU rows 41 to 43 correspond to Wavefront 1, Wavefront 2 and Wavefront 3, respectively. The LCU rows 41 to 43 are processed by respective independent cores. In Figure 4, the LCU row 41 is a normal slice, and the LCU rows 42 to 4m are dependent slices.
[00133] Dependent slices make it possible for WPP to reduce the delay. Dependent slices do not have a complete slice header. In addition, dependent slices can be decoded independently of the other slices as long as the start points (or the start points of the dependent slices, which are known by rule as described above) are known. In particular, dependent slices can make WPP suitable also for low-delay applications without incurring a coding loss.
[00134] In the usual case of encapsulating the substreams (LCU rows) in slices, it is mandatory to insert explicit start points into the slice header in order to guarantee parallel entropy encoding and decoding. As a result, a slice is ready for transmission only after the last substream of the slice has been fully encoded. The slice header is completed only after all substreams in the slice have been encoded. This means that the transmission of the beginning of a slice cannot be started by means of packet fragmentation at the RTP/IP layer until the entire slice has been finished.
[00135] However, since dependent slices can be used as start point markers, explicit start point signaling is not required. Therefore, it is possible to divide a normal slice into many dependent slices without a coding loss. Dependent slices can be transmitted as soon as the encoding of the encapsulated substream is completed (or even earlier in the case of packet fragmentation).
[00136] Dependent slices do not break the spatial prediction dependency. Dependent slices also do not break the parsing dependency. This is due to the fact that the parsing of the current dependent slice commonly requires the CABAC states of the previous slice.
[00137] When dependent slices are not allowed, each LCU row can be configured to be a slice. Such a configuration decreases the transmission delay, but at the same time leads to a somewhat high coding loss, as discussed in the Background section above.
[00138] Alternatively, the entire frame (picture) is encapsulated in a single slice. In this case, the start points for the substreams (LCU rows) need to be signaled in the slice header in order to allow their parallel parsing. As a result, there is a transmission delay at the frame level. In other words, the header needs to be modified after the entire frame has been encoded.
Encapsulating an entire picture in a single slice does not, by itself, increase the transmission delay. For example, the transmission of some parts of the slice may start even before the whole encoding has finished. However, if WPP is used, then the slice header needs to be modified afterwards in order to write the start points. Therefore, the entire slice needs to be delayed for transmission.
[00139] The use of dependent slices thus makes it possible to reduce the delay. As shown in Figure 4, a picture 400 is divided into the LCU row 41, which is a normal slice, and the LCU rows 42 to 4m, which are dependent slices. When each LCU row is a dependent slice, a transmission delay of one LCU row can be achieved without any coding loss. This is because the dependent slices do not break any spatial dependencies and do not restart the CABAC engine.
1-6. PACKET CONFIGURATION [00140] As described above, network routers have to parse packet headers in order to provide quality of service. The quality of service differs according to the type of application and/or the priority of the service and/or the relevance of the packet to the distortion caused by its loss.
[00141] Figure 5 is a diagram showing an example of encapsulation (packetization) of a bit stream.
[00142] In general, the Real-time Transport Protocol (RTP) is used for packetization. RTP is normally used for real-time media streaming. The header lengths of the respective protocols involved are basically fixed. The protocol headers have extension fields. The extension fields can extend the length of the headers by 4 bytes. For example, the IP header can be extended up to 20 bytes. The syntax elements in the IP, User Datagram Protocol (UDP) and RTP headers are also fixed in length.
[00143] Figure 5 shows packet headers 500 included in an IP packet. The packet headers shown in Figure 5 include an IP header 510, a UDP header 530, an RTP header 540, an RTP H264 payload header 560, and a NAL header 570. The IP header 510 is a header with a length of 20 bytes with an extension field 520 of 4 bytes. The payload of the IP packet is a UDP packet. The UDP packet includes the UDP header 530 with a length of 8 bytes and the UDP payload. The UDP payload is formed by the RTP packet. The RTP packet includes the RTP header 540 with a header length of 12 bytes and an extension field 550 of 4 bytes. The RTP packet can be selectively extended by the extension field. The payload of the RTP packet includes a special RTP H264 payload header 560 with a length of 0 to 3 bytes, followed by the HEVC NAL header 570, which is 2 bytes in length. The NALU payload including the encoded video data follows the packet headers 500 (not shown in Figure 5).
[00144] Routers that have the capability of providing an improved quality of service are called media-aware network elements (MANEs). A media-aware network element checks some of the fields in the packet headers shown in Figure 5. For example, the syntax element called temporal_id included in the NAL header 570, or the decoding order number included in the RTP header 540, can be checked in order to detect losses and the presentation order of the received packet contents. Routers (network elements) handle packets as quickly as possible in order to enable a high throughput in the network. The logic is required to access the fields in the packet headers quickly and simply in order to keep the complexity of the network element processing low.
[00145] A NALU is encapsulated by the packet headers 500. A NALU can include slice data when a slice header is present.
[00146] Figure 6 is a diagram showing an example of a slice header 600 syntax. The dependent_slice_flag 601 syntax element is a syntax element that indicates whether or not a slice is a dependent slice. This syntax element can be used to identify the inter-slice dependency. However, the slice header is the content of a NALU. Parsing the syntax elements before the dependent_slice_flag 601 requires somewhat complicated logic. This is a level that cannot be handled efficiently by common routers, as will be shown below.
[00147] As described above, a NALU includes information common to a plurality of slices, such as parameter sets, or directly includes encoded slices with the information needed for decoding included in the slice header. The syntax of a slice header used for an entropy slice or a dependent slice is exemplified in Figure 6. Figure 6 shows a table with a slice header structure. When the dependent_slice_flag syntax element is set to 1, all slices up to the first normal slice (a slice that is not an entropy slice and not a dependent slice) that precede the current slice in the decoding order are required. When those slices are not decoded, in general, the current dependent slice cannot be decoded. In some special cases, for example, the dependent slice may be decodable when some other signaled or derived side information is available. The dependent_slice_flag 601 syntax element is included approximately in the middle of the slice header. In addition, the slice header includes the number of CABAC substreams within the current slice, signaled by the num_entry_point_offsets 602 information element, and the number of bytes in a substream, signaled by the entry_point_offset[i] 603 syntax element. Here, the num_entry_point_offsets 602 information element corresponds to the number of entry points. In addition, i is an integer and an index that denotes a particular entry point (entry point offset). The number of bytes in a substream, denoted by entry_point_offset[i] 603, enables easy navigation within the bit stream.
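How the signaled entry points allow navigation within the bit stream can be illustrated by the following C sketch, which derives the start position of each substream from a running sum of hypothetical entry_point_offset values; the exact coding of the offsets in the HEVC drafts is not reproduced here.

```c
#include <stdio.h>

int main(void) {
    const unsigned num_entry_point_offsets = 3;              /* 4 substreams in the slice      */
    const unsigned entry_point_offset[3] = { 120, 95, 210 }; /* hypothetical byte counts       */
    unsigned start = 0;                                      /* relative to start of slice data */

    printf("substream 0 starts at byte %u\n", start);
    for (unsigned i = 0; i < num_entry_point_offsets; i++) {
        start += entry_point_offset[i];                      /* running sum gives entry point  */
        printf("substream %u starts at byte %u\n", i + 1, start);
    }
    return 0;
}
```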
1-7. PICTURE DEPENDENCIES [00148] As described above, there are several types of dependencies that result from the HEVC coding approach.
[00149] Figure 7 is a diagram showing the dependencies and their signaling in the case where only normal slices, that is, no dependent slices and no entropy slices, are used. Figure 7 shows three pictures 710, 720 and 730.
[00150] Picture 710 is a base-layer picture carried in two VCL NALUs, namely VCL NAL Unit 1 and VCL NAL Unit 2. POC indicates the order in which the pictures are to be rendered. A VCL NALU includes a syntax element that indicates whether a picture belongs to a base layer or to an enhancement layer, and the temporal_id syntax element. The syntax element that indicates whether a picture belongs to a base layer or to an enhancement layer is transmitted within the NAL header 570 of the packet headers 500 shown in Figure 5. The temporal_id syntax element is also transmitted within the NAL header 570. The temporal_id syntax element indicates the degree of dependency on the other pictures. For example, pictures or slices encoded with temporal_id = 0 are decodable independently of other pictures/slices that have a larger temporal_id. It should be noted that, in HEVC, temporal_id is signaled in the NAL header as nuh_temporal_id_plus1 (see Figure 9A). In particular, Expression 1 below can be applied to the relationship between the temporal_id used in these examples and the nuh_temporal_id_plus1 syntax element.
[Equation 1]
temporal_id = nuh_temporal_id_plus1 - 1 (Expression 1)
[00151] Slices with temporal_id = 1 depend on slices with a lower temporal_id value; in other words, on slices whose temporal_id value is 0. In particular, the temporal_id syntax element relates to the prediction structure of the picture. In general, slices with a particular value of temporal_id depend only on slices with a temporal_id value lower than or equal to that value.
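Expression 1 and the temporal dependency rule can be illustrated by the following C sketch, in which a network element keeps or drops NAL units depending on a target temporal_id; the values used are only examples.

```c
#include <stdio.h>

/* Expression 1: temporal_id is derived from the NAL unit header field. */
static int temporal_id(int nuh_temporal_id_plus1) {
    return nuh_temporal_id_plus1 - 1;
}

/* A slice may only depend on slices whose temporal_id is not larger than its
 * own, so all NAL units above a target temporal_id can be dropped together
 * without breaking the remaining ones. */
static int keep_nal_unit(int nuh_temporal_id_plus1, int target_temporal_id) {
    return temporal_id(nuh_temporal_id_plus1) <= target_temporal_id;
}

int main(void) {
    /* Example: keep only the base temporal layer (target temporal_id = 0). */
    int plus1_values[4] = { 1, 2, 1, 2 };   /* temporal_id 0, 1, 0, 1 */
    for (int i = 0; i < 4; i++)
        printf("NAL unit %d: %s\n", i,
               keep_nal_unit(plus1_values[i], 0) ? "keep" : "drop");
    return 0;
}
```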
[00152] Thus, picture 710 in Figure 7 can be decoded first.
[00153] Picture 720 is an enhancement layer for the base layer of picture 710. Thus, there is a dependency that requires picture 720 to be decoded after the decoding of picture 710. Picture 720 includes two NALUs, namely VCL NAL Unit 3 and VCL NAL Unit 4. The two pictures 710 and 720 have the same POC value of 0. This means that pictures 710 and 720 belong to the same image to be displayed at the same time. The image comprises the base layer and the enhancement layer.
[00154] Picture 730 is a base-layer picture that includes two NALUs, namely VCL NAL Unit 5 and VCL NAL Unit 6. Picture 730 has a POC value of 1. This means that the picture (portion) 730 must be displayed after pictures 720 and 710. Furthermore, picture 730 has the value temporal_id = 1. This means that picture 730 temporally depends on a picture with temporal_id = 0. Thus, based on the dependency signaled in the NAL header, picture 730 depends on picture 710.
[00155] Figure 8 is a diagram showing the dependencies (degrees of dependency) and their signaling in the case where dependent slices and entropy slices are used. Figure 8 shows three pictures 810, 820 and 830. Figure 8 differs from Figure 7 described above in that the dependencies of the dependent slices and entropy slices, signaled within the slice header, are added.
[00156] In Figure 7, the inter-layer dependency is shown with the example of pictures 710 and 720. In addition, the temporal dependency is shown with the example of pictures 710 and 730. Both of these dependencies are signaled in the NAL header.
[00157] The inter-slice dependency shown in Figure 8 is inherent to the entropy and dependent slices. In particular, the base-layer picture 810 and the enhancement-layer picture 820 both have two slices. Of the two slices, one is a parent slice (normal slice) and the other is a child slice (dependent slice). In picture 810, the slice of VCL NAL Unit 1 is the parent slice of VCL NAL Unit 2. In picture 820, the slice of VCL NAL Unit 3 is the parent slice of VCL NAL Unit 4. As described above, the term "parent slice" of a dependent slice refers to a slice on which the dependent slice depends, that is, the slice whose slice header information is used by the dependent slice. As a rule, this is the first preceding slice that has a complete header. A slice that has a complete header is a normal slice, not another dependent slice, for example.
[00158] The corresponding syntax of the NAL unit header and the slice header as currently used in HEVC and, in particular, in HM8.0 will be described with reference to Figure 9A.
[00159] Figure 9A is a diagram showing the syntax of an NAL unit header 910 and the syntax of slice header 920.
In particular, interlayer dependencies are planned (in the current standardization) to be signaled within the NAL unit header using the nuh_reserved_zero_6bits syntax element. Temporal dependencies are signaled using the nuh_temporal_id_plus1 syntax element. The slice header 920 includes a signaling of the inter-slice dependency indicator. The inter-slice dependency indicator is indicated by the dependent_slice_flag syntax element. In other words, the inter-slice dependency (unlike, for example, the temporal dependency) is signaled within the slice header, somewhere in the slice header.
[00160] In order to analyze this syntax element, all syntax elements that precede dependent_slice_flag must be analyzed, as well as the parameter set syntax elements necessary for analyzing the slice header syntax elements that precede the dependent_slice_flag.
1-8. ROUTER PROCESSING [00161] As described above, in determining traffic shaping, it is desirable to take into account the dependencies introduced by the dependent and entropy slices, in addition to the dependencies signaled in the NAL header. For example, a router can be deployed as a media-aware mobile base station. The bandwidth on the downlink is very limited and needs to be managed very carefully. The following example case is assumed. It is assumed that a packet is canceled at random upstream by a normal router. In this case, a media-aware network element (MANE) verifies the packet loss by checking the packet number. After verifying the packet loss, the MANE cancels all packets that are dependent on the canceled packet and those that follow it. This is a desirable feature for media-aware network elements. In that way, the packets can be canceled more intelligently. When a router determines to cancel an NAL unit, it will immediately deduce that the following dependent slices need to be canceled as well. In the current syntax introduced in Figure 9A, accessing dependent_slice_flag requires analysis of a considerable amount of information. This is not essential for traffic shaping or packet routing operations on routers. All the information that is necessary to verify the interlayer and intertemporal relationships is present in the video parameter set. The video parameter set is the highest set in the parameter set hierarchy.
[00162] Thus, the information described above is signaled within the NAL header 570. However, in the case of the NAL header and the slice header shown in Figure 9A, accessing the slice dependency information requires keeping track of additional parameter sets, such as the PPS and SPS. This, on the other hand, reduces the capability of media-aware routers or gateways. As seen in Figure 9A, the slice header 920 has to be analyzed up to dependent_slice_flag, and the analyzed parameters are unusable for the network operation.
[00163] In order to have the ability to parse the slice address that precedes the dependent_slice_flag, the following syntax elements are required from the syntax elements included in the SPS 930, as shown in Figure 9B. Figure 9B is a diagram showing an example of syntax included in the SPS.
[00164] * pic_width_in_luma_samples (reference sign 931 in Figure 9B)
[00165] * pic_height_in_luma_samples (reference sign 932 in Figure 9B)
[00166] * log2_min_coding_block_size_minus3 (reference sign 933 in Figure 9B)
[00167] * log2_diff_max_min_coding_block_size (reference sign 934 in Figure 9B)
[00168] These parameters are shown in the table to the right of Figure 9B and are required to obtain the slice_address parameter. The slice_address syntax element is of variable coded length (as can be seen from the length descriptor v in the second column, at slice_address in the slice header 920 in Figure 9A). In order to know the length of this variable-length coded parameter, those SPS syntax elements are required. In fact, in order to have the ability to parse the dependent_slice_flag, the actual value of the slice_address syntax element is not required. Only the length of this variable-length syntax element needs to be known so that the analysis process can continue.
[00169] Therefore, the SPS needs to be analyzed up to the point 935 of the syntax elements within the SPS 930 shown in Figure 9B. The four syntax elements are required to be stored. They are later used in a formula to calculate the length of the slice_address syntax element.
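The following is a hedged sketch of the kind of computation referred to in paragraph [00169]: the four SPS syntax elements yield the picture size in coding tree blocks, from which the bit length of the variable-length slice_address is derived. The exact formula of the draft may differ in detail; the helper names are introduced here for illustration only.

```c
/* Number of bits needed to represent values 0..x-1. */
static int ceil_log2(unsigned x)
{
    int n = 0;
    while ((1u << n) < x)
        n++;
    return n;
}

/* Sketch: length in bits of slice_address, computed from the four SPS
 * syntax elements listed above (assumed formula, see lead-in text). */
static int slice_address_length_bits(unsigned pic_width_in_luma_samples,
                                     unsigned pic_height_in_luma_samples,
                                     unsigned log2_min_coding_block_size_minus3,
                                     unsigned log2_diff_max_min_coding_block_size)
{
    unsigned ctb_log2 = log2_min_coding_block_size_minus3 + 3
                      + log2_diff_max_min_coding_block_size;
    unsigned ctb_size = 1u << ctb_log2;
    unsigned width_in_ctbs  = (pic_width_in_luma_samples  + ctb_size - 1) / ctb_size;
    unsigned height_in_ctbs = (pic_height_in_luma_samples + ctb_size - 1) / ctb_size;
    return ceil_log2(width_in_ctbs * height_in_ctbs);
}
```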
[00170] Furthermore, in order to access the dependent_slice_enabled_flag, which also precedes the dependent_slice_flag, the PPS needs to be analyzed up to the point 945 of the syntax elements within the PPS shown in Figure 9C. Figure 9C is a diagram showing an example of syntax included in the PPS. It should be noted that the syntax elements whose analysis methods are described with reference to Figures 9A to 9C, and which are located within the slice header and the SPS and PPS, are not required for common router operations. In addition, some of the syntax elements cannot simply be skipped, since some of the syntax elements are encoded with variable length codes. Thus, even if a jump by a predefined number of bits is performed in the bit stream, it is not possible to skip directly to the dependent_slice_enabled_flag.
[00171] In other words, in order to read the dependent_slice_flag (dependency indication), a MANE additionally needs to go into the slice header (refer to the slice header 920), the analysis of which is somewhat complicated.
[00172] Specifically, the first_slice_in_pic_flag indicator first has to be analyzed. The first_slice_in_pic_flag indicator is an indicator that indicates whether or not a slice is the first slice within the picture.
[00173] Then, no_output_of_prior_pics_flag, whose presence is conditional on the NALU type, has to be analyzed.
[00174] Furthermore, the variable-length coded pic_parameter_set_id has to be decoded. The pic_parameter_set_id syntax element is a syntax element that indicates which of the parameter sets is used (a syntax element that identifies the parameter set). By analyzing pic_parameter_set_id, the parameter set to be used can be identified.
[00175] Finally, the slice_address syntax element is required. The slice_address syntax element is a syntax element that indicates the starting position of the slice. Analyzing this syntax element additionally requires analyzing the PPS and SPS, as well as additional computation.
[00176] As the last step, the dependent_slice_enabled_flag value has to be obtained from the PPS, in order to know whether the dependent_slice_flag is present in the bit stream or not. When dependent_slice_enabled_flag == 0, it means that the current slice is a normal slice, since the dependent slices are not enabled. In order to obtain the value of dependent_slice_enabled_flag, the PPS is required to be analyzed up to approximately half of it.
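Putting paragraphs [00172] to [00176] together, a media-aware element would have to perform roughly the following steps before it can read dependent_slice_flag in the HM8.0-style header. This is a simplified, hypothetical sketch; the bitstream reader functions (read_u1, read_ue_v, read_uv) and the parameter set lookup helpers are not HM API calls.

```c
typedef struct Bitstream Bitstream;
typedef struct { unsigned sps_id; int dependent_slice_enabled_flag; } Pps;
typedef struct { int slice_address_len_bits; } Sps;   /* derived from the 4 SPS elements above */

extern int      read_u1(Bitstream *bs);                /* one fixed bit, u(1) */
extern unsigned read_ue_v(Bitstream *bs);              /* Exp-Golomb, ue(v)   */
extern unsigned read_uv(Bitstream *bs, int n);         /* n fixed bits, u(v)  */
extern const Pps *lookup_pps(unsigned pps_id);         /* previously parsed PPS */
extern const Sps *lookup_sps(unsigned sps_id);         /* previously parsed SPS */

int parse_dependent_slice_flag_hm8(Bitstream *bs, int rap_pic_flag)
{
    int first_slice_in_pic_flag = read_u1(bs);          /* step 1 */
    if (rap_pic_flag)
        (void)read_u1(bs);                              /* step 2: no_output_of_prior_pics_flag */
    unsigned pps_id = read_ue_v(bs);                    /* step 3: pic_parameter_set_id */

    const Pps *pps = lookup_pps(pps_id);                /* PPS must already be stored */
    const Sps *sps = lookup_sps(pps->sps_id);           /* SPS needed for slice_address length */

    int dependent_slice_flag = 0;
    if (!first_slice_in_pic_flag) {
        (void)read_uv(bs, sps->slice_address_len_bits); /* step 4: slice_address */
        if (pps->dependent_slice_enabled_flag)          /* step 5: condition from the PPS */
            dependent_slice_flag = read_u1(bs);
    }
    return dependent_slice_flag;
}
```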
[00177] Unfortunately, the syntax elements before the dependent_slice_flag cannot be omitted and need to be analyzed, differently from the case of NAL and RTP header data, where the data position is predefined. This is caused by the fact that the syntax elements in the slice header are of variable coded length. Therefore, the presence and length of the elements must be computed for each VCL NAL unit. In addition, additional session data needs to be stored because it is needed later (refer to the PPS and SPS). Furthermore, the presence of some syntax elements depends on the presence or value of other syntax elements possibly included in other parameter structures (the syntax elements are conditionally coded).
[00178] In the current standardization, there is a proposal to signal the dependency structure of the video sequence in the Video Parameter Set (VPS), which describes how many layers are contained in the bit stream and includes dependency indicators to indicate the various interlayer dependencies. The VPS is signaled at the very beginning of the video, before the first SPS. Multiple SPSs can refer to a single VPS. This means that a VPS carries information that is valid for multiple video streams. The main purpose of the VPS is to inform a router or decoder about the content of the video, including information on how many video streams are included and how they are interrelated. The SPS is valid only within a video stream, while the VPS carries information related to multiple video streams.
[00179] Furthermore, the characteristic of the information carried in the VPS is that it is informative especially for routers. For example, the VPS may carry information that is required for transmission session configuration, since the design is not finalized. The router analyzes the information in the VPS. The router, without needing other parameter sets (only viewing the NAL headers), can determine which data packets to forward to the decoder and which packets to cancel.
[00180] However, in order to verify the currently active VPS, the following ordered steps need to be performed:
analyze the PPS_id in the slice header;
analyze the SPS_id in the active PPS determined by the PPS_id; and analyze the VPS_id in the active SPS determined by the SPS_id.
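A minimal sketch of these three ordered steps, assuming parameter set tables that were filled when the PPS, SPS and VPS were received; the structure and function names are hypothetical.

```c
typedef struct { unsigned sps_id; /* ... */ } PicParamSet;
typedef struct { unsigned vps_id; /* ... */ } SeqParamSet;
typedef struct { int num_layers;  /* ... */ } VideoParamSet;

extern const PicParamSet   *get_pps(unsigned pps_id);
extern const SeqParamSet   *get_sps(unsigned sps_id);
extern const VideoParamSet *get_vps(unsigned vps_id);

const VideoParamSet *find_active_vps(unsigned pps_id_from_slice_header)
{
    const PicParamSet *pps = get_pps(pps_id_from_slice_header); /* step 1: PPS_id from the slice header */
    const SeqParamSet *sps = get_sps(pps->sps_id);               /* step 2: SPS_id from the active PPS   */
    return get_vps(sps->vps_id);                                 /* step 3: VPS_id from the active SPS   */
}
```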
[00181] In order to solve the problem described above, an image encoding method, according to one aspect of the present invention, is an image encoding method for performing encoding processing by partitioning a picture into a plurality of slices, the image encoding method comprising transmitting a bit stream that includes: a dependent slice enable indicator that indicates whether or not the picture includes a dependent slice on which the encoding processing is performed depending on a result of the encoding processing on a slice other than a current slice; a slice address indicating a starting position of the current slice; and a dependency indication indicating whether or not the current slice is the dependent slice, where the dependent slice enable indicator is arranged in a parameter set common to the slices, the slice address is arranged in a slice header of the current slice, and the dependency indication is arranged in the slice header, and is arranged before the slice address and after a syntax element (pic_parameter_set_id) that identifies the parameter set.
[00182] In the image encoding method described above, the indication of inter-slice dependency is located in a position suitable for analysis by a router. With this, it is possible to code the dependency indication independently of, in other words unconditionally on, other syntax elements.
[00183] For example, the dependency indication can be included in the bit stream when the dependent slice enable indicator indicates inclusion of the dependent slice.
[00184] For example, the dependent slice enable indicator can be arranged at the beginning of the parameter set.
[00185] For example, each of the slices can include a plurality of macroblocks, and the encoding processing on the current slice can be started after the encoding processing is performed on two of the macroblocks included in a slice immediately preceding the current slice.
[00186] For example, the dependency indication may not be included in a slice header of a slice that is processed first in the picture, among the slices.
[00187] In order to solve the problem described above, an image decoding method, according to one aspect of the present invention, is an image decoding method for performing decoding processing by partitioning a picture into a plurality of slices, the image decoding method comprising extracting, from a coded bit stream, a dependent slice enable indicator that indicates whether or not the picture includes a dependent slice on which the decoding processing is performed depending on a result of the decoding processing on a slice other than a current slice, a slice address indicating a starting position of the current slice, and a dependency indication indicating whether or not the current slice is the dependent slice, where the dependent slice enable indicator is arranged in a parameter set common to the slices, the slice address is arranged in a slice header of the current slice, and the dependency indication is arranged in the slice header, and is arranged before the slice address and after a syntax element that identifies the parameter set.
[00188] For example, the dependency indication can be extracted from the bit stream when the dependent slice enable indicator indicates inclusion of the dependent slice.
[00189] For example, the dependent slice enable indicator can be arranged at the beginning of the parameter set.
[00190] For example, each of the slices can include a plurality of macroblocks, and the decoding processing on the current slice can be started after the decoding processing is performed on two of the macroblocks included in a slice immediately preceding the current slice.
[00191] For example, the dependency indication may not be included in a slice header of a slice that is processed first in the picture, among the slices.
[00192] In order to solve the problem, an image encoding apparatus, in accordance with an aspect of the present invention, is an image encoding apparatus that performs encoding processing by partitioning a picture into a plurality of slices, the image encoding apparatus comprising an encoder that transmits a bit stream that includes: a dependent slice enable indicator indicating whether or not the picture includes a dependent slice on which the encoding processing is performed depending on a result of the encoding processing on a slice other than a current slice; a slice address that indicates a starting position of the current slice; and a dependency indication that indicates whether or not the current slice is the dependent slice, where the dependent slice enable indicator is arranged in a parameter set common to the slices, the slice address is arranged in a slice header of the current slice, and the dependency indication is arranged in the slice header, and is arranged before the slice address and after a syntax element identifying the parameter set.
[00193] In order to solve the problem, an image decoding apparatus, in accordance with an aspect of the present invention, is an image decoding apparatus that performs decoding processing by partitioning a picture into a plurality of slices, the image decoding apparatus comprising a decoder that extracts, from a coded bit stream, a dependent slice enable indicator that indicates whether or not the picture includes a dependent slice on which decoding processing is performed depending on a result of the decoding processing on a slice other than a current slice, a slice address that indicates a starting position of the current slice, and a dependency indication that indicates whether or not the current slice is the dependent slice, where the dependent slice enable indicator is arranged in a parameter set common to the slices, the slice address is arranged in a slice header of the current slice, and the dependency indication is arranged in the slice header, and is arranged before the slice address and after a syntax element to identify the parameter set.
[00194] In order to solve the problem described above, an image encoding and decoding apparatus, in accordance with an aspect of the present invention, includes the image encoding apparatus described above and the image decoding apparatus described above.
[00195] According to the image encoding method, the image decoding method and the like that are configured as above, an indication of inter-slice dependency is located within the bitstream syntax related to a slice, independently of other elements. The dependency indication is located separately from the other elements, without unnecessarily requiring the analysis of other elements. In the HEVC examples above, the inter-slice dependency indicator dependent_slice_flag is signaled at a location where it is not necessary to analyze syntax elements irrelevant to the network operation.
[00196] Specifically, the present invention provides an apparatus for analyzing a bit stream of a sequence of video images encoded at least partially with a variable length code and including data units carrying encoded slices of the video sequence. The apparatus comprises an analyzer to extract from the bit stream a dependency indication, which is a syntax element that indicates for a slice whether the variable length decoding or analysis of the slice depends or not on other slices, where the dependency indication is extracted from the bit stream independently of, and without having to extract in advance, other syntax elements.
[00197] Such an apparatus can be included, for example, within the entropy decoder 290 in Figure 2. When extraction from the bit stream is referred to, what is meant is the extraction and, when necessary for the extraction, an entropy decoding. Entropy coding is variable length coding, for example, arithmetic coding such as CABAC. This is, in HEVC, applied to the encoding of image data. The data units here refer, for example, to NAL units or access units. The expression "without the need to extract other syntax elements" refers to a situation in which the dependency indication is preceded only by elements of which the length is known, and of which the presence is either known or conditioned on elements already analyzed, or which are not coded conditionally at all.
[00198] The present invention additionally provides an apparatus for generating a bit stream of a video sequence encoded at least partially with a variable length code and which includes data units carrying encoded slices of video images. The apparatus comprises a bit stream generator for incorporating into the bit stream a dependency indication, which is a syntax element that indicates for a slice whether the variable length decoding of the slice depends or not on other slices, where the dependency indication is incorporated into the bit stream independently of, and without needing to incorporate in advance, other syntax elements.
[00199] Such apparatus can be included, for example, within the entropy encoder 190 in Figure 1.
[00200] According to the image encoding method, the image decoding method and the like that are configured above, the bit stream includes encoded slice data and header data in relation to the slice, and the dependency indicator is located within the bit stream at the beginning of the slice header. This means that the slice header starts with the syntax elements indicating the slice dependency.
[00201] It should be noted that the dependency indication does not have to be located at the very beginning of the slice header. However, it is advantageous when no variable length coded syntax elements and / or any other conditionally coded elements precede the dependency indicator within the slice header.
[00202] For example, the current position of the dependent_slice_flag is changed in relation to the prior technique described above, in order to be located at the beginning of the slice header. With this change, a reduction in the number of syntax elements that need to be analyzed is achieved. Complicated analysis operations by routers are avoided, such as variable length decoding and analysis of information that requires additional computations and/or storage of additional parameters for future use and/or analysis of other parameter sets. In addition, the number of parameter sets that are required to be tracked is reduced.
[00203] Henceforth, modalities are specifically described with reference to the Drawings. Each of the modalities described below shows a general or specific example. The numerical values, formats, materials, structural elements, the arrangement and connection of the structural elements, steps, the order of processing of the steps etc. shown in the following embodiments are merely examples and therefore do not limit the scope of the present invention. Therefore, among the structural elements in the modalities below, structural elements not mentioned in any of the independent claims are described as arbitrary structural elements.
(MODALITY 1) [00204] Figure 10 shows an example of bitstream syntax according to the present modality. A NAL header 1010 shown in Figure 10 is the same as the NAL header 910 shown in Figure 9A. In other words, there is no change.
[00205] However, the syntax structure of the slice header 1020 is different from the syntax structure of the slice header 920 in Figure 9A. In particular, in the slice header 1020, the dependent_slice_flag is moved upwards within the slice header, in such a way that there is no syntax element preceding the dependent_slice_flag that is coded conditionally, is coded using a variable length code, or requires analysis involving additional computation.
[00206] The syntax elements first_slice_in_pic_flag and dependent_slice_flag actually both determine the spatial dependencies. The syntax elements are encoded immediately after the NAL header, in such a way that no other syntax elements need to be analyzed. Since first_slice_in_pic_flag also carries information that is related to inter-slice dependencies, it can precede dependent_slice_flag. The first_slice_in_pic_flag syntax element is an indicator that is defined according to the rule that each frame must start with a normal slice. Thus, when the first_slice_in_pic_flag indicator is set, it means that the slice is a normal and, therefore, independent slice. In this way, the dependent_slice_flag and the first_slice_in_pic_flag can both be seen together as an indicator of inter-slice dependencies.
[00207] In other words, the dependency indicator can be set to include a first slice indication that indicates whether or not the slice is a first slice in a picture and a dependent slice indicator that indicates whether the variable length decoding of the slice depends or not on other slices. The first slice in a picture is always a slice for which variable length decoding does not depend on other slices.
[00208] Advantageously, the bit stream includes a dependent slice enable indicator indicating whether dependent slices can be included in the bit stream or not. The dependency indication is included in the bit stream only when the dependent slice enable indicator indicates that dependent slices can be included in the bit stream. The dependent slice enable indicator is located within the bit stream in a parameter set common to a plurality of slices, and located at the beginning of the parameter set. The parameter set can be, for example, the picture parameter set that carries parameters for a single picture. Alternatively, the dependent slice enable indicator is located within a sequence parameter set that carries parameters for the entire image (video) sequence.
[00209] However, in the present invention, the dependent_slice_flag (dependency indication) is encoded unconditionally on the dependent_slice_enabled_flag syntax element (dependent slice enable indicator). In the present modality, since the pic_parameter_set_id is located after the dependency indication, this is advantageous to avoid a possible parsing error, given that the pic_parameter_set_id is signaled within the slice header.
[00210] This change can also be seen as, and/or complemented by, changing the position of the other required syntax elements in the parameter sets or headers, in order to reduce the number of syntax elements that are required to be analyzed to determine the dependencies between slices.
[00211] For example, the dependent_slice_flag syntax element in the slice header of the present HM8.0 syntax is only present when the dependent_slice_enabled_flag syntax element value indicates that the use of dependent slices within the bit stream is enabled. The enabling of dependent slices, and thus also the dependent_slice_enabled_flag syntax element, is included in the PPS, as shown in Figure 9C. Thus, the dependent_slice_enabled_flag syntax element in the PPS is moved upwards within the PPS syntax (for example, to the beginning of the parameter set) in order to simplify the analysis of the PPS necessary to analyze the dependent_slice_flag. This can also be useful when the dependent_slice_flag is encoded after the pic_parameter_set_id (the syntax element that identifies the parameter set). This is due to the fact that, in doing so, a parsing error is avoided even when the dependent slice enable indicator conditions the presence of the dependency indication.
[00212] Instead of moving the dependent_slice_enabled_flag upwards within the PPS, the dependent_slice_enabled_flag can be moved from the PPS to the SPS and/or the VPS, so that parameter sets that are lower in the hierarchy do not need to be tracked.
[00213] In other words, according to the present modality, the position of the required syntax elements is changed in order to reduce the number of parameter sets of which track needs to be kept. This also reduces the complexity of analysis. The required parameters in this context mean the parameters that contribute to determining whether a slice is inter-slice dependent or not. A first possibility directly applicable to HEVC is to provide the dependency indication at the beginning of the dependent slice header, conditioned on the dependent slice enable indicator that is included in a parameter set different from the slice header. A second possibility directly applicable to HEVC is to provide the dependency indication in the dependent slice header after the parameter set indication that identifies the parameter set in which the dependent slice enable indicator is included. The dependency indication can be conditioned on the dependent slice enable indicator. Upward movement of the dependent slice enable indicator within the PPS, or movement of the dependent slice enable indicator to the SPS, can be beneficial for either of these possibilities. In particular, this is beneficial for the second possibility, in which the dependent slice enable indicator is needed to analyze the dependency indication.
[00214] As can be seen in Figure 10, the NAL unit header, along with the relevant portion of the slice header, has 18 bits (16 bits of the NALU header and 2 bits of the slice header). According to this example, a media-aware network element can operate for a current slice packet as follows. If a previous slice, which is a normal slice, an entropy slice or a dependent slice, is canceled, the network element checks the first two bits of the current slice header, which are the first_slice_in_pic_flag and (in the case that the dependent slices are allowed for the bit stream) the dependent_slice_flag.
[00215] When the NAL unit type is a VCL NAL unit type and the last two bits of the 18 bits checked are "01", the NAL unit is canceled. In particular, when the first bit of the slice header is 1, then it is the first slice in the picture, which is (according to the rules) not a dependent slice. When the first bit of the slice header is 0 and the next bit of the slice header is also 0, the slice is not dependent. Thus, only when the first two bits of the slice header are "01" is the slice dependent. In addition, the slice must be canceled, as it cannot be decoded when its parent slice has already been canceled. Thus, the first_slice_in_pic_flag and dependent_slice_flag indicators can be seen as an extension of the NAL header, even though they belong to the slice header syntax.
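The decision described in paragraphs [00214] and [00215] can be sketched as follows, assuming the NAL unit header and the two leading slice header bits have already been extracted; the VCL test and the parent tracking are simplifications made for this illustration.

```c
/* Returns 1 when the current packet should be canceled: it carries a VCL NAL
 * unit whose first two slice header bits are "01" (not the first slice of the
 * picture, and a dependent slice) and whose parent slice was already lost. */
int should_cancel_packet(int nal_unit_is_vcl,
                         int first_slice_in_pic_flag,
                         int dependent_slice_flag,
                         int parent_slice_was_canceled)
{
    if (!nal_unit_is_vcl)
        return 0;                 /* parameter sets, SEI messages, etc. are kept */
    if (first_slice_in_pic_flag)
        return 0;                 /* first slice of a picture is, by rule, not dependent */
    return dependent_slice_flag && parent_slice_was_canceled;
}
```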
[00216] Thus, the present modality also provides, as one of its aspects, a network router to receive, analyze and forward network packets to their destinations. The router includes a receiving unit for receiving a network packet including a packet destination address and a bitstream portion with encoded video data; an analyzer that includes the apparatus for analyzing a bit stream of an encoded video sequence according to any of the modalities mentioned above and below, in order to determine the dependence of the encoded video data on that of other packets; and a packet analyzer to analyze the received packet destination address and the determined dependency, and to judge how to handle the network packet.
(MODALITY 2) [00217] According to Modality 2, the dependent_slice_enabled_flag is removed from the PPS. It should be noted that the dependent_slice_enabled_flag can be moved to the SPS, instead of being removed.
[00218] Figure 11 shows an example in which the dependent_slice_enabled_flag does not need to be analyzed before accessing first_slice_in_pic_flag and dependent_slice_flag. [00219] In this example, the dependent_slice_enabled_flag is not used, because the presence of the dependency indication is not conditioned on it. This example provides the possibility of having the dependency indication at the beginning of the slice header without causing analysis problems due to the unknown identification of the current PPS. (EFFECT OF MODALITY 2, ETC.) [00220] In Modality 1, in order to analyze the dependent_slice_flag, the dependent_slice_enabled_flag has to be analyzed. The dependent_slice_enabled_flag is signaled in a PPS. This can cause some analysis overhead, as discussed above, when the dependent_slice_enabled_flag is located far from the start of the PPS and the preceding syntax elements are conditionally encoded.
[00221] In addition, signaling the dependent_slice_flag syntax element before the pic_parameter_set_id syntax element is analyzed can create parsing errors, as follows. The presence of dependent_slice_flag depends on the dependent_slice_enabled_flag that is signaled in the PPS. However, the identification of the currently active PPS is signaled after the dependent_slice_flag. Therefore, it is not possible to analyze the dependent_slice_flag before accessing the preceding elements.
[00222] Thus, it is advantageous to remove the analysis condition on the dependent_slice_enabled_flag. It can be more beneficial when the following restriction is applied. Namely, if the dependent_slice_enabled_flag in the PPS is zero, then the dependent_slice_flag must be equal to zero.
[00223] However, these advantageous deployments should not limit the scope of the present invention.
(MODIFICATION 1 OF MODALITIES 1 AND 2) [00224] As an alternative, or in addition, to removing the condition on the dependent_slice_enabled_flag, the dependent_slice_enabled_flag can be moved from the PPS to the SPS and/or the VPS. [00225] In addition, instead of just moving the dependent_slice_enabled_flag, the dependent_slice_enabled_flag can be duplicated in the SPS. In this case, the indicator in the SPS and PPS can be forced to have the same value. Or the PPS may be allowed to overwrite the indicator in the SPS.
[00226] For example, when sps_dependent_slice_enabled_flag is equal to 1, then the pps_dependent_slice_enabled_flag can be 0 or 1. In this case, the sps_dependent_slice_enabled_flag is an indication of enabling dependent slices for a sequence of pictures, signaled in the SPS, and the pps_dependent_slice_enabled_flag is an indication of enabling dependent slices for a picture, signaled in the PPS. However, when the value of the dependent_slice_enabled_flag may change in the PPS, it means that the analysis of the PPS is still necessary, and the advantage of less frequent tracking and analysis of the PPS is lost.
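For the variant in which the PPS is allowed to overwrite the SPS indicator, the consistency rule sketched in this paragraph could be checked as follows; the rule applied when the SPS flag is 0 is an assumption drawn from the surrounding text.

```c
/* Returns 1 when the pair of enable flags is consistent under the assumed
 * rule: an SPS value of 1 lets the PPS signal 0 or 1, whereas an SPS value
 * of 0 is assumed to forbid the PPS from enabling dependent slices. */
int dependent_slice_enable_flags_consistent(int sps_dependent_slice_enabled_flag,
                                            int pps_dependent_slice_enabled_flag)
{
    if (sps_dependent_slice_enabled_flag == 1)
        return 1;
    return pps_dependent_slice_enabled_flag == 0;
}
```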
[00227] These modifications provide the advantage that the VPS and SPS carry the dependency structures. The carrying of dependency structures by the VPS and SPS enables the network elements to shape the bit streams, that is, to determine to cancel the dependent packets that cannot be decoded anyway, or to cancel the dependent slices rather than the independent slices. In this way, a dependent_slice_enabled_flag in the VPS would trigger the router to additionally check the slice header or not.
[00228] It is observed that these modifications do not further reduce the complexity of the analysis if the example of Figures 10 and 11 is applied. However, they provide a more beneficial syntax structure for carrying dependency structures. In summary, according to this example, an indicator to indicate whether the dependent slices are enabled for the bit stream or not is signaled in a video parameter set. The video parameter set is a set of parameters that applies to more than one slice in more than one picture.
[00229] There are two different advantages of signaling the dependent_slice_enabled_flag in the VPS and/or SPS. When the indicator is only moved or duplicated, the PPS does not need to be analyzed, which reduces the analysis overhead. The other benefit is letting routers know the prediction structure of the video stream. This advantage is present all the time. Typically, a router can check the contents of a VPS/SPS to see what it will receive.
[00230] The VPS is the highest parameter set in the hierarchy. The VPS can include information about multiple video streams, while the SPS and PPS are specific to a single video stream and a single picture, respectively. The information in the VPS includes the bit rate, the temporal structure of the video streams, and the like. It also includes information about the interlayer dependencies (dependencies between different video streams). Consequently, the VPS can be viewed as a container for multiple video streams, and it gives an overview of each stream.
[00231] In the current HEVC version, the dependency between slices in a frame is established by both dependent_slice_flag and first_slice_in_pic_flag. According to the current specifications, network entities cannot use the inter-slice dependencies without applying a highly complex analysis. A direct solution would be, if a packet loss is discovered through a missing packet number, to cancel all packets until a first_slice_in_pic_flag equal to a value of 1 is found. This is because the first slice in a picture is always a normal slice.
[00232] However, this solution leads to a reduction in the coding efficiency. Thus, as described above, an inter-slice dependency signaling that enables effective analysis can be used. This is achieved by signaling dependent_slice_flag and first_slice_in_pic_flag within the slice header immediately after the NAL header.
[00233] Alternatively or in addition, the syntax elements that refer to the inter-slice dependencies are encoded unconditionally, that is, regardless of the other syntax elements that may be in the slice header or in the PPS.
(MODIFICATION 2 OF MODALITIES 1 AND 2) [00234] Figure 12 illustrates Modification 2 as an alternative to Modification 1 discussed above. In particular, the NAL unit header 1210 is the same as the NAL unit header shown in Figure 10 (the NAL unit header 910 shown in Figure 9A). However, the slice header 1220 and the slice header 1020 shown in Figure 10 are different in that the slice header syntax elements dependent_slice_flag and first_slice_in_pic_flag have their order reversed. In particular, the slice header 1220 includes the dependent_slice_flag as a first syntax element, and the first_slice_in_pic_flag syntax element as a second syntax element, conditioned on the dependent_slice_flag.
[00235] As can be seen from this example, a first slice indication that indicates whether or not the slice is a first slice in a picture is included in the syntax. A first slice in a picture is always a slice for which variable length decoding does not depend on other slices. In addition, the dependent slice indicator is included in the bit stream in front of the first slice indication. The first slice indication is included in the bit stream only when the dependent slice indicator does not indicate a dependent slice. This arrangement provides the same advantages as the conditioning in which the dependency indicator is conditioned on the first slice indication. As can be seen in Figure 12, both elements can be understood as the dependency indication and are included at the beginning of the slice header.
(MODALITY 3) [00236] In Modality 3, compared to Modalities 1 and 2, the method of arranging the syntax elements is changed to reduce the analysis of unnecessary syntax elements.
[00237] In the modalities described above, the dependent_slice_flag is described for the case where the first_slice_in_pic_flag is included as a condition for the presence of the dependent_slice_flag. However, the first_slice_in_pic_flag and the dependent_slice_flag can each be included in the bit stream without being conditioned on the presence of each other. For example, the dependent_slice_flag encoding method is changed to be independent of the dependent_slice_enabled_flag syntax element according to one of the modifications described above.
[00238] Figure 13 is a diagram showing an example of a slice header according to the present modality. Figure 13 illustrates the case of still including the conditioning of the dependency indication on the dependent slice enable indicator.
[00239] Specifically, in the slice header according to the present modality, the dependent_slice_flag is arranged before the slice_address, in comparison to the existing slice header shown in Figure 6. In addition, in the slice header according to the present modality, compared to the examples in Figures 10 to 12, the dependent_slice_flag is arranged after the pic_parameter_set_id.
[00240] In the present modality, since the dependent_slice_flag is arranged before the slice_address, at least the SPS does not need to be analyzed for the analysis of the dependent_slice_flag. As described above, the slice_address is a syntax element that indicates the beginning of a slice. In addition, the slice_address can be analyzed only with the help of syntax elements signaled within the SPS (identified via the pic_parameter_set_id).
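A hedged sketch of the parsing order of Figure 13 as described here: the dependency indication follows pic_parameter_set_id and precedes slice_address, so the dependency can be determined without computing the slice_address length from the SPS. Leading elements other than those shown, the reader helpers, and the parameter set lookup are simplifications introduced for this illustration.

```c
typedef struct Bitstream Bitstream;
extern int      read_u1(Bitstream *bs);     /* one fixed bit, u(1) */
extern unsigned read_ue_v(Bitstream *bs);   /* Exp-Golomb, ue(v)   */

/* dependent_slice_enabled_flag comes from the PPS identified by the
 * pic_parameter_set_id read below (lookup omitted for brevity). */
int parse_dependency_figure13(Bitstream *bs, int dependent_slice_enabled_flag)
{
    (void)read_u1(bs);                       /* first_slice_in_pic_flag */
    (void)read_ue_v(bs);                     /* pic_parameter_set_id    */

    int dependent_slice_flag = 0;
    if (dependent_slice_enabled_flag)
        dependent_slice_flag = read_u1(bs);  /* read before slice_address */
    /* slice_address follows; it is not needed to know the dependency. */
    return dependent_slice_flag;
}
```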
[00241] Alternatively or in addition, the dependent_slice_enabled_flag is either moved upwards within the PPS or moved to the SPS and/or the VPS. If the enable indicator is in the VPS and/or the SPS, it may not be necessary to analyze and track the PPS and SPS.
[00242] (1) The example of Figure 13 can lead to the provision of an apparatus to analyze a bit stream of a video sequence encoded at least partially with a variable length code and which includes data units that carry slices of video images. In this case, the apparatus is configured to include an analyzer that extracts the following syntax elements from the bit stream:
[00243] an indication of dependency, which is a syntax element that indicates, for a slice in the slice header, whether the variable length decoding of the slice depends or not on other slices;
[00244] an indicator that enables dependent slices, included within a parameter set for a plurality of slices, and that indicates whether the dependent slices can be included in the bit stream or not; and [00245] a slice address that indicates the position within the bit stream at which the slice begins.
[00246] (2) In addition, in the present modality, the dependency indication is signaled within the slice header before the slice address and after a syntax element that identifies the set of parameters.
[00247] With this modality, it is possible, without causing analysis errors, to configure the dependency indication to be included in the bit stream only when the indicator that enables dependent slice indicates that dependent slices can be included in the bit stream.
[00248] (3) In the present modality, the indicator that enables dependent slice is located within the bit stream in a common set of parameters (PPS) for a plurality of slices that form the same picture frame and located at the beginning of the set of parameters. However, it is not limited to this.
[00249] Alternatively (or in addition), the indicator that enables dependent slice is located within the bit stream in a common set of parameters (SPS) for a plurality of slices that form the same sequence of figures. Alternatively (or in addition), the indicator that enables dependent slice is located within the bit stream in a common set of parameters (VPS) for a plurality of slices that form a plurality of picture frame sequences.
[00250] (4) In addition, in the present modality, the VPS_id and the SPS_id can be signaled explicitly in an SEI message. When the dependent_slice_enabled_flag is signaled in the SPS, the dependent_slice_flag must still follow the pic_parameter_set_id.
[00251] Otherwise, an analysis dependency is introduced, since the SPS_id is signaled in the PPS. With the signaling of the identification of the current SPS or VPS carrying the dependent_slice_enabled_flag, the dependency indication can also be included before the pic_parameter_set_id, since then the analysis of the picture parameter set is not necessary. In addition, such an SEI message carrying the VPS_id or SPS_id is not necessary for the decoding operation, since these IDs are also determined by the analysis of the PPS. The SEI message can therefore be discarded, after being used by the network elements, without affecting the decoding operation.
(MODALITY 4) [00252] In Modality 4, the inter-slice dependency information is duplicated (supplementary to the information signaled in the slice header and/or in a parameter set) in another NAL unit, such as an SEI message.
[00253] For example, an SEI message can be defined that carries the inter-slice dependency information in each access unit or before each dependent slice. The term "access unit" refers to a data unit that is made up of a set of NAL units. An access unit includes encoded picture slices, that is, VCL NALUs. In particular, access units can define points for random access and can include NALUs of a single picture. However, the access unit is not necessarily a random access point.
[00254] In the current HEVC specifications, the access unit is defined as a set of NAL units that are consecutive in decoding order and contain exactly one coded picture. In addition to the coded slice NAL units of the coded picture, the access unit can also contain other NAL units that do not contain coded picture slices. Decoding an access unit always results in a decoded picture. However, in a future extension of HEVC (such as Multi-View Coding (MVC) or Scalable Video Coding (SVC)), the definition of the access unit can be relaxed or modified. According to the current specifications, the access unit is formed by an access unit delimiter, SEI messages, and VCL NALUs.
[00255] According to the present modality, the dependency indication is located within the bit stream outside the header of a slice to which the dependency indication refers. In addition, it can be beneficial when the dependency indication is located within the bit stream in a supplementary enhancement information message included in the bit stream before the dependent slice or once per access unit.
(MODALITY 5) [00256] According to Modality 5, the inter-slice dependency information is signaled in the NAL header, either as an indicator or implicitly as the type of NAL unit with which it is associated.
[00257] As a rule, the analysis of syntax elements in the NAL header does not depend on any other syntax elements. Each NAL unit header can be analyzed independently. The NAL header is the usual place to signal dependency information. Consequently, according to the present modality, the inter-slice dependency is also signaled within the NAL header.
[00258] In other words, the analysis apparatus can be adopted in a router or in a decoder. The analysis apparatus additionally includes a network abstraction layer unit to add a NAL header to a slice of encoded video data and to its slice header. Advantageously, the dependency indication is located within the bit stream in the NAL header and is coded independently of the other syntax elements.
[00259] The dependency indicator can be placed inside the NAL header since the NAL header in the current HEVC specifications includes some reserved bits that can be used for it. A single bit would be sufficient to signal the dependency indication.
[00260] Alternatively, the dependency indication is indicated by an NAL unit type and a predefined NAL unit type is reserved to carry dependency information.
(MODALITY 6) [00261] It is observed that the five modalities above can be combined arbitrarily to enable an effective analysis of the dependency information in the network elements. Even when their use is redundant, the modalities are combinable. Consequently, duplication of the dependency indication can be applied even when the dependency indication is also signaled at the beginning of the slice header.
[00262] Figure 14 shows an example of the NAL unit header 1410, in which the NAL unit header 910 shown in Figure 9A is modified. The NAL unit header 1410 includes the dependent_slice_flag.
[00263] In addition, in order to move the dependent_slice_flag into the NAL header and to keep the size of the NAL header fixed for backward compatibility, the bit required for the dependent_slice_flag is taken from the nuh_reserved_zero_6bits syntax element of the NAL unit header. Consequently, the nuh_reserved_zero_6bits syntax element now has only 5 bits. The nuh_reserved_zero_6bits syntax element includes bits reserved for future use, so that the reduction does not cause any problems and does not require any additional modifications.
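A possible layout of the modified 16-bit NAL unit header of Figure 14, where one bit of nuh_reserved_zero_6bits is re-used as dependent_slice_flag and the reserved field shrinks to 5 bits. The exact position of the new bit within the header chosen here is an assumption for illustration only.

```c
#include <stdint.h>

typedef struct {
    unsigned forbidden_zero_bit      : 1;
    unsigned nal_unit_type           : 6;
    unsigned dependent_slice_flag    : 1;   /* bit taken from the reserved field */
    unsigned nuh_reserved_zero_5bits : 5;
    unsigned nuh_temporal_id_plus1   : 3;
} NalUnitHeader1410;

/* Unpacks a 16-bit header read MSB-first from the bit stream. */
static NalUnitHeader1410 unpack_nal_header(uint16_t raw)
{
    NalUnitHeader1410 h;
    h.forbidden_zero_bit      = (raw >> 15) & 0x1;
    h.nal_unit_type           = (raw >> 9)  & 0x3F;
    h.dependent_slice_flag    = (raw >> 8)  & 0x1;
    h.nuh_reserved_zero_5bits = (raw >> 3)  & 0x1F;
    h.nuh_temporal_id_plus1   =  raw        & 0x7;
    return h;
}
```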
[00264] In general, a current VCL NAL unit depends on the previous VCL NAL unit that has the same temporal_layer_id. When the dependent_slice_flag is signaled in the NAL header, one bit will be spent for both VCL and non-VCL NAL units, since each data unit, such as a picture slice or parameter set, has the same NAL header. Consequently, although it appears that the dependent_slice_flag would also be signaled for parameter sets or SEI messages, this is unnecessary. In addition, the dependent_slice_flag always needs to be signaled even if the dependent slices are disabled in the sequence parameter set. This leads to unnecessary overhead.
[00265] In all of the above modes, the dependency indication can be a one-bit indicator.
(MODALITY 7) [00266] According to Modality 7, the dependency indication is indicated by a type of NAL unit, and a predefined type of NAL unit is reserved to carry dependency information.
[00267] Consequently, a new (separate) VCL NAL unit type is defined, with semantics similar to the existing VCL NAL units. For example, when NAL_unit_type is equal to 15 (or to another predefined NALU type that is not reserved for another particular type of NALU), then the current VCL NAL unit depends on the previous VCL NAL unit that has the same temporal_layer_id. The dependency refers to the dependency of the current slice on the slice header of a preceding slice, as described above, that is, the dependency in the analysis.
[00268] It may be advantageous in these cases to use the bit in the NAL header for the additional NAL unit types. This can be used to indicate whether a current slice is a dependent slice or not.
[00269] When the dependency information is signaled in the slice header in addition to the NAL header, the signaling in the NAL header becomes optional. Specifically, when the NAL unit type field in the NAL header is set to signal that the current slice is a dependent slice, then it is not possible to signal any other type information. For example, in some cases, it may be more beneficial to carry the information that a current slice is a first picture in the sequence (NAL_unit_type equal to 10 or 11). When the inter-slice dependency information in the NAL header is optional (since it is duplicated in the slice header), the most valuable information can be chosen for signaling.
[00270] In addition, it may be advantageous to add two or more VCL NAL unit types, such as "dependent slice of a RAP picture" (required for analysis) or "dependent slice of a non-RAP picture". RAP denotes a random access picture. A random access picture is a picture encoded independently (in terms of prediction) from other pictures, so that it can be used as a starting point for encoding and decoding. Thus, it is suitable, as well, as a random access point.
[00271] In the dependent slice header, the syntax element RapPicFlag is used in the analysis process. Specifically, the RapPicFlag syntax element is an indication of whether the current picture is a random access picture or not.
[00272] The value of RapPicFlag depends on the NAL unit type, as in the following Expression 2.
[Equation 2]
RapPicFlag = ( nal_unit_type >= 7 && nal_unit_type <= 12 ) (Expression 2)
[00273] In other words, in the example shown in Figure 15, random access pictures are carried by NALUs with a NALU type between 7 and 12. To enable correct analysis and to provide a slice dependency possibility for random access pictures, in the present invention two different types of NAL units are defined to ensure a correct slice header analysis.
[00274] As a general rule, even when a new VCL NAL unit type is defined, the slice header analysis should still be possible without any problem. Either multiple NAL unit types are defined as above, or the dependent slice header is changed in such a way that there is no problem in the analysis. [00275] When a new VCL NAL unit type is defined to indicate the dependent slice, the slice header syntax structure can be changed as follows.
[00276] In the example above, the NAL unit type DS_NUT is used to indicate that the current VCL NAL unit is a dependent slice. In comparison to the state-of-the-art slice header syntax structure that is described in Non-Patent Literature 3, the following two changes are introduced in the present modality.
[00277] (1) no_output_of_prior_pics_flag is not signaled in the dependent slice header. In other words, the presence of no_output_of_prior_pics_flag is based on the condition that the current slice is not a dependent slice. (no_output_of_prior_pics_flag can be present in the slice header when the current slice is not a dependent slice.)
[00278] (2) first_slice_in_pic_flag is conditionally signaled on the value of nal_unit_type. When the value of the nal_unit_type indicates that the current slice is a dependent slice, the syntax element first_slice_in_pic_flag is not explicitly signaled and is inferred to be 0. This saves bit rate at the same quality.
[00279] According to the example, the no_output_of_prior_pics_flag is not signaled when the current slice is a dependent slice. Consequently, the RapPicFlag value does not need to be evaluated when the current slice is a dependent slice. Thus, the slice header of a dependent slice can be analyzed without a problem. More specifically, the slice header of the dependent slice can be analyzed without referring to the NAL unit header of a preceding NAL unit. A problem occurs when the preceding NAL unit header is not present at the time of decoding.
[00280] Second, the first_slice_in_pic_flag is signaled based on the value of the NAL unit type. This change is the same as that of the example described in Figure 12. In Figure 12, the first_slice_in_pic_flag is signaled in the slice header only when the current slice is not a dependent slice (which is indicated by dependent_slice_flag). Similarly, in the example above, a first_slice_in_pic_flag is signaled only when the nal_unit_type is not equal to DS_NUT, which means that the current slice is not a dependent slice.
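The two changes can be sketched as follows, assuming the new dependent-slice NAL unit type is called DS_NUT (with the example value 15 mentioned in the text) and that a helper exists to test whether a NAL unit type is a RAP type; both names are hypothetical.

```c
typedef struct Bitstream Bitstream;
extern int read_u1(Bitstream *bs);                   /* one fixed bit, u(1) */
extern int is_rap_nal_unit_type(int nal_unit_type);  /* e.g. per Expression 2 */

#define DS_NUT 15   /* example value only */

void parse_leading_dependent_slice_bits(Bitstream *bs, int nal_unit_type,
                                        int *first_slice_in_pic_flag,
                                        int *is_dependent_slice)
{
    *is_dependent_slice = (nal_unit_type == DS_NUT);

    /* Change (2): not signaled for dependent slices, inferred to be 0. */
    if (*is_dependent_slice)
        *first_slice_in_pic_flag = 0;
    else
        *first_slice_in_pic_flag = read_u1(bs);

    /* Change (1): no_output_of_prior_pics_flag is absent for dependent
     * slices; otherwise its presence still depends on the RAP condition. */
    if (!*is_dependent_slice && is_rap_nal_unit_type(nal_unit_type))
        (void)read_u1(bs);
}
```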
[00281] The two changes that are presented above do not need to be made together. It is also possible to make only one of the changes to the slice header. The benefit of each change comes at the cost of verifying whether a slice is a dependent slice or not. However, when the two changes are made together, the benefits of both changes can be obtained at the same cost as the benefit of each of the individual changes, in the case where the two syntax elements first_slice_in_pic_flag and no_output_of_prior_pics_flag are coded consecutively. Thus, the application of both changes in combination with consecutive coding of the two mentioned syntax elements gives an advantage over the direct application of each of the changes individually.
[00282] Throughout the explanation in the modalities, it is also possible to remove the dependent_slice_enabled_flag from the bit stream when the dependent slice indication is not conditionally coded on it. In other words, when, for example, a new type of NAL unit is used to indicate that the current slice is a dependent slice, then the dependent_slice_enabled_flag can be removed from the bit stream.
[00283] Figure 15 shows a NAL unit header 1510 that is the same as the NAL unit header 910 shown in Figure 9A, and a slice header 1520 that is changed from the slice header 920 shown in Figure 9A. The slice header 1520 includes the determination of the dependent_slice_flag value according to the NALU type. In particular, a NAL_unit_type syntax element with values 15 and 16 defines dependent slices. When the NAL_unit_type is equal to 15, the slice is a dependent slice of a random access picture. If, on the other hand, the NAL_unit_type is equal to 16, the slice is a dependent slice of a non-random access picture. Thus, the relationship of the following Expression 3 is established.
[Equation 3]
RapPicFlag = ( nal_unit_type >= 7 && nal_unit_type <= 12 || nal_unit_type == 15 ) (Expression 3)
[00284] Note that the values 15 and 16 were selected only as an example. As is evident to those skilled in the art, any predefined numbers that are not otherwise used can be adopted. Specifically, a first NALU type must be defined to identify the content of a slice dependent on a random access picture, and a second NALU type must be defined to identify the content of a slice dependent on a non-random access picture.
[00285] In addition, a restriction can be applied in which the dependent slices are used only for RAPs, or used only for non-RAPs. In such cases, only one new NALU type is needed.
(MODALITY 8) [00286] Figure 16 is a diagram showing an alternative solution. A NAL unit header 1610 is the same as the NAL unit header 910. The slice header 1620 assumes the definition of NAL unit types with values 15 and 16 that signal dependent slices, as described above.
[00287] However, the NAL unit type is not used in the analysis of the dependent slice indicator. This enables the use of the NAL unit type to be optional for the encoder. Consequently, the advantage of the present modality is obtained only when the encoder decides to adopt the new NALU types.
[00288] So, the router only needs to look for the NALU type. However, when the encoder does not use the new NALU types, the router would treat the dependent slices as in the prior art.
[00289] In summary, the dependency indication can be indicated by a type of NAL unit. A predefined NAL unit type can be reserved to carry coded slices whose slice header depends on the slice header of a preceding slice. Advantageously, a separate NAL unit type that indicates the dependency is provided for random access pictures and for non-random access pictures.
[00290] In summary, the modalities described above refer to a syntax of a bit stream that carries encoded video sequences. In particular, the modalities described above refer to a syntax related to entropy and dependent slices, for which the slice header depends on the slice header of a preceding slice. To allow a media-aware network element to consider this type of dependency without essentially increasing its complexity and the delay due to analysis, the indication of dependency is signaled at the beginning of the packets, or in other words in the vicinity of the headers or parameters that must be analyzed. This is achieved, for example, by including the dependency indication at the beginning of the slice header (Figures 10 to 12), possibly after the parameter set identifier and before the slice address, or by including the dependency indication before the slice address (Figures 10 and 11), or by providing the dependency indication in a NALU header (Figure 14), in a separate message, or by a special type of NALU for NALUs that carry dependent slices (Figures 15 and 16).
(MODIFICATIONS OF MODALITIES 1 TO 8, EFFECTS, AND THE LIKE) [00291] Several changes are possible without being limited to Modalities 1 to 8, and they are obviously included in the scope of the present invention.
[00292] Each of the structural elements in each of the modalities described above can be configured in the form of a dedicated hardware product (processing circuit), or can be realized by executing a software program suitable for the structural element. Each of the structural elements can be realized by means of a program execution unit, such as a CPU or a processor, reading and executing the software program recorded on a recording medium such as a hard disk or a semiconductor memory.
[00293] Although the description in Modalities 1 to 8 assumes a wavefront, the present invention is not limited to it.
[00294] However, in the case of a wavefront, not all substreams can be started at the same time. As described above, for each of the substreams except the first one, the start of processing (encoding or decoding) is delayed by two LCUs relative to the preceding substream. Thus, in the wavefront case, a further shortening of the processing is desirable. In the present modality, by locating the dependency indication (dependent slice flag) after the syntax element that identifies the PPS and before the slice address, the number of syntax elements to be parsed can be reduced and, thus, the processing is reduced.
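The two-LCU offset mentioned above can be illustrated by the following toy C program; the number of LCU rows is an arbitrary example.

#include <stdio.h>

/* Wavefront (WPP) schedule: row k may start only after the first two LCUs of
 * row k-1 have been processed, so its earliest start, measured in LCU
 * processing slots, is 2*k. */
int main(void)
{
    const int lcu_rows = 5;
    for (int row = 0; row < lcu_rows; row++)
        printf("LCU row %d: earliest start at slot %d\n", row, 2 * row);
    return 0;
}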
[00295] In addition, in Modalities 1 to 8 described above, by placing the dependency indication at an upstream position within the slice header (particularly at the beginning), it is possible, for example, to verify whether each slice is a dependent slice or not at an early stage of the processing of a picture.
[00296] In other words, at the moment the processing (encoding or decoding) of a picture starts, when a step verifies whether each of the slices is a dependent slice, it is possible to extract a starting point for parallel processing at that moment. In other words, when the picture includes a plurality of normal slices, it is possible to extract a starting point for parallel processing at the moment the processing of the picture starts, or at an early stage of the processing.
[00297] Here, conventionally, when the dependency indication is placed after the slice address, it is not possible to check whether the slice is a dependent slice or a normal slice until the parsing of the slice address is finished. In this case, the start of processing of a normal slice in the middle of the picture is significantly delayed relative to the start of processing of the normal slice at the beginning of the picture.
[00298] Conversely, in Modalities 1 to 8 described above, since it is possible to verify whether each of the slices is a dependent slice or not at an early stage of the processing of a picture, it is possible to advance the start of processing of a normal slice in the middle of the picture. In other words, it is possible to start processing a normal slice in the middle of a picture at the same time as the normal slice at the beginning of the picture.
(Mode 9) [00299] The processing described in each of the modalities can be simply implemented in an independent computer system by recording, on a recording medium, a program for implementing the configurations of the moving picture encoding method (image encoding method) and the moving picture decoding method (image decoding method) described in each of the modalities. The recording medium can be any recording medium as long as the program can be recorded on it, such as a magnetic disk, an optical disk, a magneto-optical disk, an IC card, or a semiconductor memory.
[00300] Hereinafter, applications of the motion picture encoding method (image encoding method) and the motion picture decoding method (image decoding method) described in each of the modalities, and systems using them, will be described. The system is characterized by having an image encoding and decoding apparatus that includes an image encoding apparatus using the image encoding method and an image decoding apparatus using the image decoding method. Other configurations in the system can be changed as appropriate depending on the case.
[00301] Figure 17 illustrates a general configuration of a system that provides ex100 content for deploying content distribution services. The area for providing communication services is divided into cells of the desired size and base stations ex106, ex107, ex108, ex109 and ex110 which are fixed wireless stations are placed in each of the cells.
[00302] The system that provides ex100 content is connected to devices such as an ex111 computer, an ex112 personal digital assistant (PDA), an ex113 camera, an ex114 cell phone and an ex115 gaming machine, via the ex101 Internet, a provider of Internet services ex102, a telephone network ex104, as well as base stations ex106 to ex110, respectively.
[00303] However, the system configuration that provides ex100 content is not limited to the configuration shown in Figure 17 and a combination in which any of the elements are connected is acceptable. In addition, each device can be connected directly to the ex104 telephone network, rather than through the base stations ex106 to ex110 which are the fixed wireless stations. In addition, the devices can be interconnected with each other by short distance and other wireless communication.
[00304] The camera ex113, like a digital video camera, can capture video. An ex116 camera, like a digital camera, can
capture both still images and video. In addition, the cell phone ex114 may be one that meets any of the standards such as Global System for Mobile Communications (GSM) (registered trademark), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (W-CDMA), Long Term Evolution (LTE) and High Speed Packet Access (HSPA). Alternatively, the cell phone ex114 can be a Personal Handyphone System (PHS).
[00305] In the system that provides content ex100, an ex103 streaming server is connected to the camera ex113 and others via the telephone network ex104 and the base station ex109, which allows the distribution of images from a live show and others. In such a distribution, content (for example, video from a live music show) captured by the user using the camera ex113 is encoded as described above in each of the modalities (that is, the camera functions as the video encoding device). image in accordance with an aspect of the present invention) and the encoded content is transmitted to the streaming server ex103. On the other hand, the streaming server ex103 performs current distribution of the content data transmitted to the clients after their requests. Customers include the computer ex111, the PDA ex112, a camera ex113, the cell phone ex114 and the game machine ex115 which can decode the encrypted data mentioned above. Each of the devices that received the distributed data decodes and reproduces the encoded data (i.e., it functions as the image decoding apparatus according to an aspect of the present invention).
[00306] The captured data can be encoded by the camera ex113 or by the streaming server ex103 that transmits the data or the encoding processes can be shared
between the camera ex113 and the streaming server ex103. Similarly, the distributed data can be decoded by the clients or by the streaming server ex103, or the decoding processes can be shared between the clients and the streaming server ex103. In addition, still image and video data captured not only by the camera ex113, but also by the camera ex116, can be transmitted to the streaming server ex103 through the computer ex111. The encoding processes can be performed by the camera ex116, the computer ex111 or the streaming server ex103, or shared among them.
[00307] In addition, the encoding and decoding processes can be carried out by an LSI ex500 generally included in each of the computer ex111 and the devices. The LSI ex500 can be configured as a single chip or a plurality of chips. Software for encoding and decoding video can be integrated into some type of recording medium (such as a CD-ROM, a floppy disk or a hard disk) that is readable by the computer ex111 and others, and the encoding and decoding processes can be performed using the software. In addition, when the cell phone ex114 is equipped with a camera, the video data obtained by the camera can be transmitted. The video data is data encoded by the LSI ex500 included in the cell phone ex114.
[00308] In addition, the streaming server ex103 can be composed of servers and computers and can decentralize data and process the decentralized data, save or distribute data. [00309] As described above, customers can receive and reproduce data encoded in the system that provides ex100 content. In other words, customers can receive and decode information transmitted by the user and reproduce the decoded data
in real time in the system that provides content ex100, so that a user who does not have any particular right or equipment can implement personal broadcasting.
[00310] In addition to the example of the system that provides ex100 content, at least one of the mobile figuration encoding apparatus (image encoding apparatus) and the mobile figuration decoding apparatus (image decoding apparatus) described in each modalities can be deployed in an ex200 digital broadcasting system illustrated in Figure 18. More specifically, an ex201 broadcasting station communicates or transmits, via radio waves to an ex202 broadcasting satellite, the multiplexed data obtained by multiplexing data from audio and others in the video data. The video data is data encoded by the motion picture encoding method described in each of the embodiments (i.e., data encoded by the image encoding apparatus in accordance with an aspect of the present invention). After receiving the multiplexed data, the broadcast satellite ex202 transmits radio waves for broadcast. Then, an ex204 home antenna with a satellite broadcast reception function receives the radio waves. Next, a device such as a television (receiver) ex300 and a converter box (STB) ex217 decodes the received multiplexed data and reproduces the decoded data (that is, it functions as the image decoding apparatus according to an aspect of the present invention. ).
[00311] In addition, an ex218 reader/writer (i) reads and decodes the multiplexed data recorded on an ex215 recording medium such as a DVD or a BD, or (ii) encodes video signals onto the ex215 recording medium and, in some cases, writes data obtained by multiplexing an audio signal with the encoded data. The ex218 reader/writer can include the moving picture decoding apparatus or the moving picture encoding apparatus as shown in each of the modalities. In this case, the reproduced video signals are displayed on the monitor ex219 and can be reproduced by another device or system that uses the recording medium ex215 on which the multiplexed data is recorded. It is also possible to implement the moving picture decoding apparatus in the ex217 converter box connected to the ex203 cable for cable television or to the ex204 antenna for satellite and/or terrestrial broadcasting, so as to display the video signals on the ex219 monitor of the ex300 television. The moving picture decoding apparatus may be implemented not in the converter box but in the ex300 television.
[00312] Figure 19 illustrates the television (receiver) ex300 that uses the motion picture encoding method and the motion picture decoding method described in each of the modalities. The ex300 television includes: an ex301 tuner that obtains or provides multiplexed data obtained by multiplexing audio data into video data via the ex204 antenna or the ex203 cable, etc. that receives a broadcast; a modulation/demodulation unit ex302 that demodulates the received multiplexed data or modulates data into multiplexed data to be supplied outside; and a multiplexing/demultiplexing unit ex303 that demultiplexes the modulated multiplexed data into video data and audio data, or multiplexes video data and audio data encoded by an ex306 signal processing unit into data.
[00313] The ex300 television additionally includes: an ex306 signal processing unit which includes an ex304 audio signal processing unit and an ex305 video signal processing unit that decodes audio data and video data and encodes data from audio and video data, respectively (which functions as the image encoding device and the
75/102 image decoding according to the aspects of the present invention); and an ex309 output unit that includes an ex307 speaker that provides the decoded audio signal and an ex308 display unit that displays the decoded video signal as a display. In addition, the ex300 television includes an ex317 interface unit that includes an ex312 operation input unit that receives input from a user operation. In addition, the ex300 television includes an ex310 control unit that generally controls each constituent element of the ex300 television and an ex311 power supply circuit unit that supplies power to each of the elements. In addition to the operation input unit ex312, the interface unit ex317 may include: a bridge ex313 that is connected to an external device such as the reader / writer ex218; a slot unit ex314 to allow the attachment of the recording medium ex216 as an SD card; an ex315 driver to be connected to an external recording medium such as a hard drive; and an ex316 modem to be connected to a telephone network. Here, the ex216 recording medium can electrically record information using a volatile / non-volatile memory semiconductor memory element for storage. The constituent elements of the ex300 television are connected to each other via a synchronous bus.
[00314] First, the configuration in which the ex300 television decodes multiplexed data obtained from outside via the ex204 antenna and others and reproduces the decoded data will be described. On the ex300 television, after a user operation via a remote controller ex220 and others, the multiplexing / demultiplexing unit ex303 demultiplexes the demodulated multiplexed data by the modulation / demodulation unit ex302 under the control of the ex310 control unit which includes a CPU. In addition, the ex304 audio signal processing unit decodes demul audio data
76/102 tiplexed and the video signal processing unit ex305 decodes the demultiplexed video data using the decoding method described in each of the modalities on the ex300 television. The output unit ex309 provides the decoded video signal and an external audio signal, respectively. When the output unit ex309 provides the video signal and the audio signal, the signals can be temporarily stored in temporary storage ex318 and ex319 and others so that the signals are reproduced in synchronization with each other. In addition, the ex300 television can read multiplexed data not through a broadcast and other, but from the recording media ex215 and ex216 such as a magnetic disk, an optical disk and an SD card. In the following, a configuration in which the ex300 television encodes an audio signal and a video signal and transmits the external data or writes the data to a recording medium will be described. On the ex300 television, after a user operation by the remote controller ex220 and others, the audio signal processing unit ex304 encodes an audio signal and the video signal processing unit ex305 encodes a video signal under the control of the unit control system ex310 that uses the coding method described in each of the modalities. The multiplexing / demultiplexing unit ex303 multiplexes the encoded video signal and the audio signal and supplies the resulting signal to the outside. When the multiplexing / demultiplexing unit ex303 multiplexes the video signal and the audio signal, the signals can be stored temporarily in the temporary storage ex320 and ex321 and others so that the signals are reproduced in synchronization with each other. Here, the temporary stores ex318, ex319, ex320 and ex321 can be in the plural as illustrated or at least one temporary store can be shared on the television ex300. In addition, data can be stored in a storage
so that overflow and underflow of the system can be avoided between the modulation/demodulation unit ex302 and the multiplexing/demultiplexing unit ex303, for example.
[00315] In addition, the ex300 television can include a setting to receive AV input from a microphone or a camera other than the setting to obtain audio and video data from a broadcast or recording medium and can encode the data obtained . Although the ex300 television can encode, multiplex and provide external data in the description, it may only be able to receive, decode and provide external data, but not the encoding, multiplexing and provision of external data.
[00316] In addition, when the reader / writer ex218 reads or writes multiplexed data from or on a recording medium, one of the ex300 television and the reader / writer ex218 can decode or encode the multiplexed data and the television ex300 and the reader / ex218 recorder can share decoding or encoding.
[00317] As an example, Figure 20 illustrates a configuration of an ex400 information playback / recording unit when data is read from or written to or on an optical disc. The information reproduction / recording unit ex400 includes constituent elements ex401, ex402, ex403, ex404, ex405, ex406 and ex407 to be described hereinafter. The ex401 optical head radiates a laser point on a recording surface of the ex215 recording medium which is an optical disc for writing information and detects reflected light from the recording surface of the ex215 recording medium to read the information. The ex402 modulation recording unit electrically drives a semiconductor laser included in the ex401 optical head and modulates the laser light according to the recorded data. The reproduction demodulation unit ex403 amplifies a reproduction signal obtained by the electrical detection of reflected light
78/102 of the recording surface with the use of a photodetector included in the optical head ex401 and demodulates the reproduction signal by separating a signal component recorded in the recording medium ex215 to reproduce the necessary information. The temporary storage ex404 temporarily retains the information to be recorded on the recording medium ex215 and the information reproduced from the recording medium ex215. The disc motor ex405 rotates the recording medium ex215. The ex406 servo control unit moves the ex401 optical head to a predetermined information track while controlling the ex405 disc motor's rotation drive to follow the laser point. The ex407 system control unit generally controls the ex400 information playback / recording unit. The reading and writing processes can be implemented by the ex407 system control unit using various information stored in the ex404 temporary storage and by generating and adding new information as needed and by the modulation recording unit ex402, by the demodulation unit reproduction ex403 and servo control unit ex406 which record and reproduce information through the optical head ex401 while being operated in a coordinated manner. The ex407 system control unit includes, for example, a microprocessor and performs processing by having a computer run a program for reading and writing.
[00318] Although the optical head ex401 radiates a laser point in the description, it can perform high density recording with the use of light by proximity field.
[00319] Figure 21 illustrates the recording medium ex215 which is the optical disc. On the recording surface of the ex215 recording medium, guide slots are formed in a spiral and an ex230 information track records, in advance, address information indicating
79/102 an absolute position on the disc according to a change in the shape of the guide grooves. The address information includes information for determining positions of recording blocks ex231 which are a unit for recording data. Playing the ex230 information track and reading the address information on a device that records and reproduces data can lead to the determination of the positions of the recording blocks. In addition, the recording medium ex215 includes a data recording area ex233, an internal circumference area ex232 and an external circumference area ex234. The data recording area ex233 is an area for use in recording user data. The inner circumference area ex232 and the outer circumference area ex234 that are inside and outside the data recording area ex233, respectively, are for specific use, except for the recording of user data. The information playback / recording unit 400 reads and writes encoded audio, encoded video data or multiplexed data obtained by multiplexing the encoded audio and video data from and in the data recording area ex233 of the recording medium ex215.
[00320] Although an optical disc that has a layer, such as a DVD and a BD, is described as an example in the description, the optical disc is not limited to such and may be an optical disc that has a multilayer structure and that can be engraved on a different part of the surface. In addition, the optical disc can have a structure for multidimensional recording / reproduction such as recording information using color light with different wavelengths in the same portion of the optical disc and recording information that has different layers from various angles.
[00321] In addition, an ex210 car that has an ex205 antenna can receive data from the ex202 satellite and others and play video on a display device such as a car navigation system
ex211 set in the ex210 car, in the ex200 digital broadcasting system. Here, a configuration of the ex211 car navigation system will be a configuration including, for example, a GPS receiving unit in addition to the configuration illustrated in Figure 19. The same will be true for the configurations of the computer ex111, the cell phone ex114 and others.
[00322] Figure 22A illustrates the cell phone ex114 that uses the motion picture encoding method and the motion picture decoding method, described in the modalities. The cell phone ex114 includes: an antenna ex350 for transmitting and receiving radio waves through the base station ex110; an ex365 camera unit that can capture moving and still images; and an ex358 display unit as a liquid crystal display to display data as captured video decoded by the ex365 camera unit or received by the ex350 antenna. The cell phone ex114 additionally includes: a main body unit that includes an operating key unit ex366; an ex357 audio output unit as a speaker for audio output; an ex356 audio input unit as a microphone for audio input; an ex367 memory unit for storing captured video or still pictures, recorded audio, encoded or decoded data from received video, still pictures, e-mails or others; and an ex364 slot unit which is an interface unit for a recording medium that stores data in the same way as the ex367 memory unit.
[00323] Below, an example of a cell phone configuration ex114 will be described with reference to Figure 22B. On the ex114 cell phone, an ex360 main control unit designed to generally control each main body unit that includes the ex358 display unit as well as the operating switch unit.
ex366 is mutually connected, via an ex370 synchronous bus, to an ex361 power supply circuit unit, an ex362 operation input control unit, an ex355 video signal processing unit, an ex363 camera interface unit, an ex359 liquid crystal display (LCD) control unit, an ex352 modulation/demodulation unit, an ex353 multiplexing/demultiplexing unit, an ex354 audio signal processing unit, an ex364 slot unit and the ex367 memory unit.
[00324] When an end-of-call key or a power switch is switched ON by user operation, the ex361 power supply circuit unit supplies the respective units with power from a rechargeable battery in order to activate the cell phone ex114.
[00325] On the cell phone ex114, the audio signal processing unit ex354 converts the audio signals collected by the audio input unit ex356 in voice talk mode to digital audio signals under the control of the main control unit ex360 which includes a CPU, a ROM and a RAM. Then, the ex352 modulation / demodulation unit performs distributed spectrum processing on digital audio signals and the ex351 transmit and receive unit performs digital to analog conversion and frequency conversion on the data in order to transmit the resulting data through the ex350 antenna. . In addition, on the ex114 cell phone, the ex351 transmit and receive unit amplifies the data received by the ex350 antenna in voice talk mode and performs frequency conversion and analog to digital conversion on the data. Then, the ex352 modulation / demodulation unit performs reverse distributed spectrum processing on the data and the ex354 audio signal processing unit converts them into analog audio signals in order to output them via the output unit of audio ex357.
[00326] In addition, when an e-mail in data communication mode is transmitted, text data of the e-mail entered by operating the ex366 operating key unit and others of the main body is sent to the main control unit ex360 via the operation input control unit ex362. The main control unit ex360 causes the modulation/demodulation unit ex352 to perform distributed spectrum processing on the text data, and the transmission and reception unit ex351 performs digital-to-analog conversion and frequency conversion on the resulting data to transmit the data to the ex110 base station via the ex350 antenna. When an e-mail is received, processing that is approximately the reverse of the processing for transmitting an e-mail is performed on the received data, and the resulting data is provided to the display unit ex358.
[00327] When video, still images or video and audio in data communication mode is or is transmitted, the ex355 video signal processing unit compresses and encodes video signals supplied from the ex365 camera unit using the encoding method in motion picture shown in each of the modalities (i.e., functions as the image encoding apparatus according to the aspect of the present invention) and transmits the encoded video data to the multiplexing / demultiplexing unit ex353. In contrast, during the time when the ex365 camera unit captures video, still images and the like, the ex354 audio signal processing unit encodes audio signals collected by the ex356 audio input unit and transmits the encoded audio data to the multiplexing / demultiplexing unit ex353.
[00328] The multiplexing/demultiplexing unit ex353 multiplexes the encoded video data supplied from the ex355 video signal processing unit and the encoded audio data supplied from the ex354 audio signal processing unit using a predetermined method. Then, the modulation/demodulation unit (modulation/demodulation circuit unit) ex352 performs distributed spectrum processing on the multiplexed data, and the ex351 transmission and reception unit performs digital-to-analog conversion and frequency conversion on the data so as to transmit the resulting data through the ex350 antenna.
[00329] When receiving data from a video file that is linked to a web page and others in data communication mode or when receiving an email with video and / or audio attached to decode the multiplexed data received through the antenna ex350, the multiplexing / demultiplexing unit ex353 demultiplexes the multiplexed data into a video data bit stream and an audio data bit stream and supplies the video signal processing unit ex355 with the encoded video data and the unit of audio signal processing ex354 the audio data encoded via the synchronous bus ex370. The video signal processing unit ex355 decodes the video signal using a motion picture decoding method that corresponds to the motion picture encoding method shown in each of the modes (ie it works as the device image decoding according to the aspect of the present invention) and then the display unit ex358 displays, for example, the video and still images included in the video file linked to the web page via the LCD control unit ex359. In addition, the ex354 audio signal processing unit decodes the audio signal and the ex357 audio output unit provides
the audio.
[00330] Furthermore, similarly to the ex300 television, it is possible for a terminal such as the cell phone ex114 to have 3 types of implantation configurations that include not only (i) a transmission and reception terminal that includes both a encoding as a decoding apparatus, but also (ii) a transmitting terminal that includes only a coding apparatus and (iii) a receiving terminal that includes only a decoding apparatus. Although the digital broadcasting system ex200 receives and transmits the multiplexed data obtained by multiplexing audio data into video data in the description, the multiplexed data can be data obtained by multiplexing not audio data, but character data related to the video in video data and may not be multiplexed data, but data from the video itself.
[00331] Thus, the method of encoding motion picture and the method of decoding motion picture in each of the modalities can be used in any of the described devices and systems. Thus, the advantages described in each of the modalities can be obtained.
[00332] Furthermore, the present invention is not limited to the modalities and various modifications and revisions are possible without departing from the scope of the present invention.
(MODE 10) [00333] Video data can be generated by switching, as necessary, between (i) the motion picture encoding method or the moving picture encoding apparatus shown in each of the modalities and (ii) a motion picture encoding method or a moving picture encoding apparatus conforming to a different standard such as MPEG-2, MPEG-4 AVC, or VC-1.
[00334] Here, when a plurality of video data that conforms to different standards is generated and is then decoded, the decoding methods need to be selected to conform to the different standards. However, since it cannot be detected which standard each of the plurality of video data to be decoded conforms to, there is a problem that an appropriate decoding method cannot be selected.
[00335] To solve the problem, multiplexed data obtained by multiplexing audio and other data into video data has a structure that includes identifying information that indicates which standard the video data conforms to. The specific structure of the multiplexed data that includes the video data generated in the motion picture encoding method and by the mobile picture encoding apparatus shown in each of the modalities will be described hereinafter. The multiplexed data is a digital stream in the MPEG-2 Transport Stream format.
[00336] Figure 23 illustrates a multiplexed data structure. As illustrated in Figure 23, multiplexed data can be obtained by multiplexing at least one of a video stream, an audio stream, a presentation graphics stream (PG) and an interactive graphics stream. The video stream represents a primary video and a secondary movie video, the audio stream (IG) represents a primary audio part and a secondary audio part to be mixed with the primary audio part and the graphics graphics stream. presentation represents subtitles of the film. Here, the primary video is a normal video to be displayed on a screen and the secondary video is a video to be displayed in a smaller window on the primary video. In addition, the stream of interactive graphics represents an interactive screen to be generated by
GUI components on a screen. The video stream is encoded by the motion picture encoding method or by the moving picture encoding apparatus shown in each of the modalities, or by a motion picture encoding method or a moving picture encoding apparatus in accordance with a conventional standard such as MPEG-2, MPEG-4 AVC, or VC-1. The audio stream is encoded according to a standard such as Dolby AC-3, Dolby Digital Plus, MLP, DTS, DTS-HD, or linear PCM.
[00337] Each stream included in the multiplexed data is identified by a PID. For example, 0x1011 is allocated to the video stream to be used for the video of a movie, 0x1100 to 0x111F are allocated to the audio streams, 0x1200 to 0x121F are allocated to the presentation graphics streams, 0x1400 to 0x141F are allocated to the interactive graphics streams, 0x1B00 to 0x1B1F are allocated to the video streams to be used for the secondary video of the movie, and 0x1A00 to 0x1A1F are allocated to the audio streams to be used for the secondary audio to be mixed with the primary audio.
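The PID allocation listed above can be turned into a simple classification routine; the following C sketch is only illustrative, and the enumerator names are assumptions of this sketch.

#include <stdint.h>

typedef enum { VIDEO_PRIMARY, AUDIO, PRESENTATION_GRAPHICS, INTERACTIVE_GRAPHICS,
               VIDEO_SECONDARY, AUDIO_SECONDARY, UNKNOWN_STREAM } StreamKind;

/* Classify an elementary stream by the PID ranges listed above. */
static StreamKind classify_pid(uint16_t pid)
{
    if (pid == 0x1011)                  return VIDEO_PRIMARY;
    if (pid >= 0x1100 && pid <= 0x111F) return AUDIO;
    if (pid >= 0x1200 && pid <= 0x121F) return PRESENTATION_GRAPHICS;
    if (pid >= 0x1400 && pid <= 0x141F) return INTERACTIVE_GRAPHICS;
    if (pid >= 0x1B00 && pid <= 0x1B1F) return VIDEO_SECONDARY;
    if (pid >= 0x1A00 && pid <= 0x1A1F) return AUDIO_SECONDARY;
    return UNKNOWN_STREAM;
}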
[00338] Figure 24 illustrates schematically how the data is multiplexed. First, a video stream ex235 composed of video frames and an audio stream ex238 composed of audio frames are transformed into a stream of PES packets ex236 and a stream of PES packets ex239 and additionally into TS packets ex237 and TS packets ex240, respectively. Similarly, data from a stream of presentation graphics ex241 and data from a stream of interactive graphics ex244 are transformed into a stream of PES packets ex242 and a stream of PES packets ex245 and additionally into TS packets ex243 and TS packets ex246, respectively. . These TS packets are multiplexed in a chain to obtain multiplexed data ex247.
[00339] Figure 25 illustrates in more detail how a video stream is stored in a stream of PES packets. The first bar in Figure 25 shows a video frame stream in a video stream. The second bar shows the stream of PES packets. As indicated by the arrows denoted yy1, yy2, yy3 and yy4 in Figure 25, the video stream is divided into pictures such as I pictures, B pictures and P pictures, each of which is a video presentation unit, and the pictures are stored in the payload of each of the PES packets. Each PES packet has a PES header, and the PES header stores a Presentation Time Stamp (PTS) that indicates a display time of the picture and a Decoding Time Stamp (DTS) that indicates a decoding time of the picture.
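PTS and DTS are expressed in ticks of the 90 kHz system clock of MPEG-2 systems, a fact not restated in this description; the small C sketch below only illustrates how a player relates the two stamps, and the numeric values are arbitrary examples.

#include <stdint.h>
#include <stdio.h>

/* Convert a 90 kHz PTS/DTS tick count to seconds. */
static double stamp_to_seconds(uint64_t ticks_90khz)
{
    return (double)ticks_90khz / 90000.0;
}

int main(void)
{
    uint64_t dts = 180000, pts = 183000;  /* decode at 2.000 s, present at about 2.033 s */
    printf("DTS = %.3f s, PTS = %.3f s\n", stamp_to_seconds(dts), stamp_to_seconds(pts));
    return 0;
}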
[00340] Figure 26 illustrates the format of the TS packets to be finally written in the multiplexed data. Each TS packet is a 188-byte fixed-length packet that includes a 4-byte TS header carrying information such as a PID to identify a stream, and a 184-byte TS payload to store data. The PES packets are divided and stored in the TS payloads, respectively. When a BD-ROM is used, each TS packet is given a 4-byte TP_Extra_Header, resulting in 192-byte source packets. The source packets are written in the multiplexed data. The TP_Extra_Header stores information such as an Arrival_Time_Stamp (ATS). The ATS shows a transfer start time at which each TS packet is to be transferred to a PID filter. The source packets are arranged in the multiplexed data as shown at the bottom of Figure 26. The numbers incrementing from the head of the multiplexed data are called source packet numbers (SPNs).
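A sketch of how one 192-byte source packet described above could be split into its TP_Extra_Header and its TS packet is given below in C; the bit layouts follow common MPEG-2 TS and BD conventions and are shown only for illustration.

#include <stdint.h>

#define SOURCE_PACKET_SIZE 192
#define TS_PACKET_SIZE     188

/* Split one source packet into the 4-byte TP_Extra_Header (carrying the
 * Arrival_Time_Stamp) and the 188-byte TS packet, then extract the PID from
 * the 4-byte TS header. Returns 0 on success, -1 if the sync byte is wrong. */
static int parse_source_packet(const uint8_t sp[SOURCE_PACKET_SIZE],
                               uint32_t *ats, uint16_t *pid)
{
    /* TP_Extra_Header: 2-bit copy permission indicator + 30-bit ATS. */
    *ats = ((uint32_t)(sp[0] & 0x3F) << 24) | ((uint32_t)sp[1] << 16) |
           ((uint32_t)sp[2] << 8) | sp[3];

    const uint8_t *ts = sp + 4;       /* the 188-byte TS packet follows */
    if (ts[0] != 0x47)                /* sync byte */
        return -1;
    *pid = (uint16_t)(((ts[1] & 0x1F) << 8) | ts[2]);  /* 13-bit PID */
    return 0;                         /* payload: up to 184 bytes at ts[4..187] */
}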
[00341] Each of the TS packets included in the multiplexed data includes not only streams of audio, video, subtitles and others, but also a Program Association Table (PAT), a Program Map Table (PMT) and a Program Clock Reference (PCR). The PAT shows what a PID in a PMT used in the multiplexed data indicates, and a PID of the PAT itself is registered as zero. The PMT stores the PIDs of the video, audio, subtitle and other streams included in the multiplexed data, and the attribute information of the streams corresponding to the PIDs. The PMT also has various descriptors related to the multiplexed data. The descriptors carry information such as copy control information showing whether copying of the multiplexed data is permitted or not. The PCR stores STC time information corresponding to an ATS showing when the PCR packet is transferred to a decoder, in order to achieve synchronization between an Arrival Time Clock (ATC) that is a time axis of ATSs and a System Time Clock (STC) that is a time axis of PTSs and DTSs.
[00342] Figure 27 illustrates the data structure of the PMT in detail. A PMT header is placed at the top of the PMT. The PMT header describes the length of the data included in the PMT and the like. A plurality of descriptors related to the multiplexed data is arranged after the PMT header. Information such as the copy control information is described in the descriptors. After the descriptors, a plurality of pieces of stream information referring to the streams included in the multiplexed data is arranged. Each piece of stream information includes stream descriptors that each describe information such as a stream type to identify the compression codec of a stream, a stream PID and stream attribute information (such as a frame rate or an aspect ratio). The stream descriptors are equal in number to the number of streams in the multiplexed data.
[00343] When the multiplexed data is recorded on a recording medium and the like, it is recorded together with multiplexed data information files.
[00344] Each of the multiplexed data information files is multiplexed data management information as shown in Figure 28. The multiplexed data information files are matched one by one with the multiplexed data and each of the files includes information multiplexed data, current attribute information and an insertion map.
[00345] As shown in Figure 28, the multiplexed data information includes a system rate, a playback start time and a playback end time. The system rate indicates the maximum transfer rate at which a system target decoder, to be described later, transfers the multiplexed data to a PID filter. The ATS intervals included in the multiplexed data are set so as not to be greater than the system rate. The playback start time indicates a PTS of a video frame at the head of the multiplexed data. An interval of one frame is added to a PTS of a video frame at the end of the multiplexed data, and that PTS is set as the playback end time.
[00346] As shown in Figure 29, a piece of attribute information is registered in the stream attribute information for each PID of each stream included in the multiplexed data. Each piece of attribute information carries different information depending on whether the corresponding stream is a video stream, an audio stream, a presentation graphics stream or an interactive graphics stream. Each piece of video stream attribute information carries information including what type of compression codec is used to compress the video stream, and the resolution, aspect ratio and frame rate of the pieces of picture data included in the video stream. Each piece of audio stream attribute information carries information including what type of compression codec is used to compress the audio stream, how many channels are included in the audio stream, which language the audio stream supports, and how high the sampling frequency is. The video stream attribute information and the audio stream attribute information are used to initialize a decoder before the player plays back the information.
[00347] In the present modality, the multiplexed data to be used is of the stream type included in the PMT. In addition, when the multiplexed data is recorded on a recording medium, the video stream attribute information included in the multiplexed data information is used. More specifically, the motion picture encoding method or the moving picture encoding apparatus described in each of the modalities includes a step or a unit for allocating unique information, which indicates video data generated by the motion picture encoding method or by the moving picture encoding apparatus in each of the modalities, to the stream type included in the PMT or to the video stream attribute information. With this configuration, the video data generated by the motion picture encoding method or by the moving picture encoding apparatus described in each of the modalities can be distinguished from video data that conforms to another standard.
[00348] In addition, Figure 30 illustrates steps of the motion picture decoding method according to the present modality. In Step exS100, the stream type included in the PMT or the video stream attribute information included in the multiplexed data information is obtained from the multiplexed data. Then, in Step exS101, it is determined whether or not the stream type or the video stream attribute information indicates that the multiplexed data is generated by the motion picture encoding method or by the moving picture encoding apparatus in each of the modalities. When it is determined that the stream type or the video stream attribute information indicates that the multiplexed data is generated by the motion picture encoding method or by the moving picture encoding apparatus in each of the modalities, in Step exS102, decoding is performed by the motion picture decoding method in each of the modalities. In addition, when the stream type or the video stream attribute information indicates conformity with conventional standards such as MPEG-2, MPEG-4 AVC and VC-1, in Step exS103, decoding is performed by a motion picture decoding method in conformity with the conventional standards.
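Steps exS100 to exS103 amount to a simple dispatch on the identification information; the following runnable C sketch illustrates only the control flow, with stub functions standing in for the two decoding paths.

#include <stdio.h>

typedef enum { CODEC_OF_THE_MODALITIES, CODEC_MPEG2, CODEC_MPEG4_AVC, CODEC_VC1 } CodecId;

/* Stub decoders standing in for the two decoding paths. */
static void decode_with_modality_method(void)     { puts("decode with the method of the modalities"); }
static void decode_with_conventional_method(void) { puts("decode with a conventional method"); }

/* Dispatch on the identification information (stream type or video stream
 * attribute information) obtained from the multiplexed data in Step exS100. */
static void decode_multiplexed_data(CodecId id_from_pmt)
{
    if (id_from_pmt == CODEC_OF_THE_MODALITIES)   /* Steps exS101 and exS102 */
        decode_with_modality_method();
    else                                          /* Step exS103 */
        decode_with_conventional_method();
}

int main(void)
{
    decode_multiplexed_data(CODEC_OF_THE_MODALITIES);
    decode_multiplexed_data(CODEC_MPEG4_AVC);
    return 0;
}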
[00349] Thus, allocating a new unique value to the stream type or to the video stream attribute information makes it possible to determine whether or not the motion picture decoding method or the moving picture decoding apparatus described in each of the modalities can perform the decoding. Even when multiplexed data that conforms to a different standard is input, an appropriate decoding method or apparatus can be selected. Thus, it becomes possible to decode information without any error. In addition, the motion picture encoding method or apparatus and the motion picture decoding method or apparatus in the present modality can be used in the devices and systems described above.
(MODALITY 11) [00350] Each of the motion picture encoding method, the moving picture encoding apparatus, the motion picture decoding method and the moving picture decoding apparatus in each of the modalities is typically achieved in the form of an integrated circuit or a Large Scale Integration (LSI) circuit. As an example of the LSI, Figure 31 illustrates a configuration of the LSI ex500 that is made into one chip. The LSI ex500 includes the elements ex501, ex502, ex503, ex504, ex505, ex506, ex507, ex508 and ex509 to be described below, and the elements are connected to each other through a bus ex510. The power supply circuit unit ex505 is activated by supplying each of the elements with power when the power supply circuit unit ex505 is turned on.
[00351] For example, when encoding is performed, the LSI ex500 receives an AV signal from an ex117 microphone, an ex113 camera and others via an ex509 AV IO under the control of an ex501 control unit that includes an ex502 CPU , an ex503 memory controller, an ex504 current controller and an ex512 drive frequency control unit. The received AV signal is temporarily stored in an external memory ex511 as an SDRAM. Under the control of the ex501 control unit, the stored data is segmented into data portions according to the amount and processing speed to be transmitted to an ex507 signal processing unit. The signal processing unit ex507 then encodes an audio signal and / or a video signal. Here, the encoding of the video signal is the encoding described in each of the modalities. In addition, the ex507 signal processing unit sometimes multiplexes the encoded audio data and the encoded video data and a current IO ex506
provides the multiplexed data to the outside. The provided multiplexed data is transmitted to the base station ex107 or written on the recording medium ex215. When data sets are multiplexed, the data should be temporarily stored in the temporary storage ex508 so that the data sets are synchronized with each other.
[00352] Although the ex511 memory is an element outside the LSI ex500, it can be included in the LSI ex500. The temporary storage ex508 is not limited to one temporary storage, but can be composed of a plurality of temporary storages. In addition, the LSI ex500 can be made into one chip or a plurality of chips.
[00353] Furthermore, although the ex501 control unit includes the ex502 CPU, the ex503 memory controller, the ex504 current controller, the ex512 drive frequency control unit, the configuration of the ex501 control unit is not limited to such. For example, the signal processing unit ex507 may additionally include a CPU. The addition of another CPU to the ex507 signal processing unit can improve processing speed. In addition, as another example, the ex502 CPU may serve as or be part of the ex507 signal processing unit and, for example, may include an audio signal processing unit. In such a case, the control unit ex501 includes the signal processing unit ex507 or the CPU ex502 which includes a part of the signal processing unit ex507.
[00354] The name used here is LSI, but it can also be called IC, system LSI, super LSI or ultra LSI depending on the degree of integration.
[00355] In addition, the ways to achieve integration are not limited to the LSI, and a dedicated circuit or a general purpose processor and so on can also achieve the integration. A Field Programmable Gate Array (FPGA) that can be programmed after manufacturing LSIs, or a reconfigurable processor that allows reconfiguration of the connection or configuration of an LSI, can be used for the same purpose. Such a programmable logic device can typically execute the motion picture encoding method and/or the motion picture decoding method according to any of the above modalities, by loading or reading from a memory or the like one or more programs that are included in software or firmware.
[00356] In the future, with the advance in semiconductor technology, a new technology can replace LSI. Functional blocks can be integrated using such technology. The possibility is that the present invention is applied to biotechnology.
(MODE 12) [00357] When video data generated by the motion picture encoding method or by the moving picture encoding apparatus described in each of the modalities is decoded, it is possible that the amount of processing increases compared to when video data that conforms to a conventional standard such as MPEG-2, MPEG-4 AVC or VC-1 is decoded. Thus, the LSI ex500 needs to be set to a drive frequency higher than that of the CPU ex502 used when video data in conformity with the conventional standard is decoded. However, when the drive frequency is set higher, there is a problem that the power consumption increases.
[00358] To solve the problem, the mobile picture decoding device such as the television ex300 and the LSI ex500 are configured to determine which standard the video data conforms to and switch between the trigger frequencies according to the standard
determined. Figure 32 illustrates a configuration ex800 in the present modality. A drive frequency switching unit ex803 sets a drive frequency to a higher drive frequency when video data is generated by the motion picture encoding method or by the moving picture encoding apparatus described in each of the modalities. Then, the drive frequency switching unit ex803 instructs a decoding processing unit ex801 that executes the motion picture decoding method described in each of the modalities to decode the video data. When the video data conforms to the conventional standard, the drive frequency switching unit ex803 sets a drive frequency to a drive frequency lower than that of the video data generated by the motion picture encoding method or by the moving picture encoding apparatus described in each of the modalities. Then, the drive frequency switching unit ex803 instructs the decoding processing unit ex802 that conforms to the conventional standard to decode the video data.
[00359] More specifically, the drive frequency switching unit ex803 includes the CPU ex502 and the drive frequency control unit ex512 in Figure 31. Here, each of the decoding processing unit ex801 that performs the method of motion picture decoding described in each of the modalities and the decoding processing unit ex802 that conforms to the conventional standard corresponds to the signal processing unit ex507 in Figure 31. CPU ex502 determines which standard the video data conforms to . The drive frequency control unit ex512 then determines a drive frequency based on a signal from the
CPU ex502. In addition, the ex507 signal processing unit decodes the video data based on the signal from the CPU ex502. For example, it is possible that the identification information described in Mode 10 is used to identify the video data. The identification information is not limited to the one described in Mode 10, but can be any information as long as the information indicates which standard the video data conforms to. For example, when which standard the video data conforms to can be determined based on an external signal for determining whether the video data is used for a television or a disc, etc., the determination can be made based on such an external signal. In addition, the CPU ex502 selects a drive frequency based, for example, on a lookup table in which the standards of the video data are associated with the drive frequencies, as shown in Figure 34. The drive frequency can be selected by storing the lookup table in the temporary storage ex508 or in an internal memory of the LSI, and by the CPU ex502 referring to the lookup table.
[00360] Figure 33 illustrates steps for executing a method in the present modality. First, in Step exS200, the signal processing unit ex507 obtains the identification information from the multiplexed data. Then, in Step exS201, the CPU ex502 determines whether or not the video data is generated by the encoding method and the encoding apparatus described in each of the modalities, based on the identification information. When the video data is generated by the motion picture encoding method and by the moving picture encoding apparatus described in each of the modalities, in Step exS202, the CPU ex502 transmits a signal for setting the drive frequency to a higher drive frequency to the drive frequency control unit ex512. Then, the drive frequency control unit ex512 sets the drive frequency to the higher drive frequency. On the other hand, when the identification information indicates that the video data conforms to a conventional standard such as MPEG-2, MPEG-4 AVC or VC-1, in Step exS203, the CPU ex502 transmits a signal for setting the drive frequency to a lower drive frequency to the drive frequency control unit ex512. Then, the drive frequency control unit ex512 sets the drive frequency to a drive frequency lower than that in the case where the video data is generated by the motion picture encoding method and by the moving picture encoding apparatus described in each of the modalities.
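Steps exS200 to exS203 reduce to selecting one of two drive frequencies from the identification information; the C sketch below illustrates only this selection, and the frequency values are arbitrary examples rather than properties of any real LSI.

#include <stdbool.h>
#include <stdio.h>

enum { FREQ_HIGH_MHZ = 500, FREQ_LOW_MHZ = 250 };  /* illustrative values only */

/* Higher drive frequency for video data generated by the method of the
 * modalities, lower drive frequency for a conventional standard such as
 * MPEG-2, MPEG-4 AVC or VC-1. */
static int select_drive_frequency_mhz(bool generated_by_modality_method)
{
    return generated_by_modality_method ? FREQ_HIGH_MHZ : FREQ_LOW_MHZ;
}

int main(void)
{
    printf("stream of the modalities -> %d MHz\n", select_drive_frequency_mhz(true));
    printf("conventional stream      -> %d MHz\n", select_drive_frequency_mhz(false));
    return 0;
}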
[00361] In addition, together with the switching of the drive frequencies, the power conservation effect can be improved by changing the voltage to be applied to the LSI ex500 or to a device that includes the LSI ex500. For example, when the drive frequency is set lower, it is possible that the voltage to be applied to the LSI ex500 or to the device that includes the LSI ex500 is set to a voltage lower than that in the case where the drive frequency is set higher.
[00362] In addition, when the amount of processing for decoding is higher, the triggering frequency can be set higher and, when the amount of processing for decoding is lower, the triggering frequency can be set lower as the method for setting the activation frequency. Thus, the definition method is not limited to those described above. For example, when the amount of processing for decoding MPEG-4 AVC video data is greater than the amount of processing for decoding video data generated by the method
of motion picture encoding and by the moving picture encoding apparatus described in each of the modalities, it is possible that the drive frequency is set in reverse order to the setting described above.
[00363] In addition, the method for setting the drive frequency is not limited to the method of setting the drive frequency lower. For example, when the identification information indicates that the video data is generated by the motion picture encoding method and by the moving picture encoding apparatus described in each of the modalities, it is possible that the voltage to be applied to the LSI ex500 or to the device that includes the LSI ex500 is set higher. When the identification information indicates that the video data conforms to a conventional standard such as MPEG-2, MPEG-4 AVC or VC-1, it is possible that the voltage to be applied to the LSI ex500 or to the device that includes the LSI ex500 is set lower. As another example, it is possible that, when the identification information indicates that the video data is generated by the motion picture encoding method and by the moving picture encoding apparatus described in each of the modalities, the driving of the CPU ex502 is not suspended, and that, when the identification information indicates that the video data conforms to a conventional standard such as MPEG-2, MPEG-4 AVC or VC-1, the driving of the CPU ex502 is suspended at a given time because the CPU ex502 has extra processing capacity. It is possible that, even when the identification information indicates that the video data is generated by the motion picture encoding method and by the moving picture encoding apparatus described in each of the modalities, in the case where the CPU ex502 has extra processing capacity, the driving of the CPU ex502 is suspended at a given time. In such a case, it is possible
that the suspension time is set shorter than that in the case where the identification information indicates that the video data conforms to a conventional standard such as MPEG-2, MPEG-4 AVC or VC-1.
[00364] Consequently, the power conservation effect can be enhanced by switching between the drive frequencies in accordance with the standard to which the video data conforms. In addition, when the LSI ex500 or the device that includes the LSI ex500 is driven using a battery, the battery life can be extended with the power conservation effect.
(MODE 13) [00365] There are cases where a plurality of video data that conforms to different standards is provided to the devices and systems, such as a television and a cell phone. To enable decoding of the plurality of video data that conforms to the different standards, the signal processing unit ex507 of the LSI ex500 needs to conform to the different standards. However, the problems of an increase in the scale of the circuit of the LSI ex500 and an increase in cost arise with the individual use of the signal processing units ex507 that conform to the respective standards.
[00366] To solve the problem, what is conceived is a configuration in which the decoding processing unit for implementing the motion picture decoding method described in each of the modalities and the decoding processing unit that conforms to a conventional standard such as MPEG-2, MPEG-4 AVC or VC-1 are partially shared. Ex900 in Figure 35A shows an example of this configuration. For example, the motion picture decoding method described in each of the modalities and the motion picture decoding method that conforms to MPEG-4 AVC have, partially
in common, the details of processing such as entropy coding, inverse quantization, deblocking filtering and motion compensated prediction. It is possible that an ex902 decoding processing unit that conforms to MPEG-4 AVC is shared for the common processing operations, and that a dedicated ex901 decoding processing unit is used for the processing that is unique to an aspect of the present invention and does not conform to MPEG-4 AVC. The decoding processing unit for implementing the motion picture decoding method described in each of the modalities can be shared for the processing to be shared, and a dedicated decoding processing unit can be used for the processing unique to MPEG-4 AVC.
[00367] In addition, ex1000 in Figure 35B shows another example in which the processing is partially shared. This example uses a configuration that includes a dedicated decoding processing unit ex1001 that supports processing unique to an aspect of the present invention, a dedicated decoding processing unit ex1002 that supports processing unique to another conventional standard, and a decoding processing unit ex1003 that supports processing to be shared between the motion picture decoding method according to the aspect of the present invention and the conventional motion picture decoding method. Here, the dedicated decoding processing units ex1001 and ex1002 are not
necessarily specialized for the processing according to the aspect of the present invention and for the processing of the conventional standard, respectively, and can be units capable of implementing general processing. In addition, the configuration of the present modality can be implemented by the LSI ex500.
[00368] In this way, reducing the scale of the circuit of an LSI and reducing the cost are possible by sharing the decoding processing unit for the processing to be shared between the moving picture decoding method according to the aspect of the the present invention and the method of decoding motion figures in accordance with the conventional standard.
INDUSTRIAL APPLICABILITY [00369] An image encoding method and an image decoding method according to the present invention can be applied to various multimedia data. The image encoding method and the image decoding method according to the present invention are useful as an image encoding method and an image decoding method in storage, transmission, communication and the like using a mobile phone, a DVD device, a personal computer and the like.
LIST OF REFERENCE SIGNS
100 encoder
105 subtractor
110 processing unit
120 quantization unit
130, 230 reverse transformation unit
140, 240 adder
150, 250 deblocking filter
160, 260 adaptive loop filter
170, 270 frame memory
180, 280 prediction unit
190 entropy encoder
200 decoder
290 entropy decoder
300, 400, 710 picture
31, 32, 3i, 41, 42 LCU line
311, 312, 3i1, 331 LCU
500 packet header
510 IP header
520, 550 extension field
530 UDP header
540 RTP header
560 payload header
570 NAL header
s1 input signal
s2 prediction signal
e, e' prediction error signal
s', s'', s3 reconstructed signal
Claims:
Claims (13)
[1]
1. Image encoding method for performing encoding processing by partitioning a picture into a plurality of slices, the image encoding method being characterized by the fact that it comprises transmitting a bit stream that includes: an indicator that enables the dependent slice, indicating whether or not the picture includes a dependent slice on which the encoding processing is performed depending on a result of the encoding processing on a slice other than a current slice; a slice address that indicates a starting position of the current slice; and a dependency indication that indicates whether or not the current slice is the dependent slice, wherein the indicator that enables the dependent slice is arranged in a set of parameters common to the slices, the slice address is arranged in a slice header of the current slice, and the dependency indication is arranged in the slice header, before the slice address and after a syntax element that identifies the set of parameters.
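The ordering recited in claim 1 may be illustrated, without limitation, by the following C++ sketch. BitWriter, Slice and writeSliceHeader are hypothetical names, and the writer is a simple placeholder rather than the variable-length coding of an actual bit stream; the sketch only shows that the dependency indication is written after the syntax element identifying the parameter set and before the slice address, and only when the indicator in the parameter set enables dependent slices (compare claims 2 and 5).

    #include <cstdint>
    #include <vector>

    // Minimal stand-in bit writer; a real encoder would emit Exp-Golomb codes.
    struct BitWriter {
        std::vector<uint32_t> symbols;
        void writeUE(uint32_t v)          { symbols.push_back(v); }
        void writeFlag(bool b)            { symbols.push_back(b ? 1u : 0u); }
        void writeBits(uint32_t v, int n) { (void)n; symbols.push_back(v); }
    };

    struct Slice {
        uint32_t parameterSetId        = 0;     // identifies the common parameter set
        bool     isFirstSliceInPicture = true;
        bool     isDependentSlice      = false; // the dependency indication
        uint32_t sliceAddress          = 0;     // starting position of the slice
        int      addressBits           = 8;
    };

    // Element order as recited in claim 1: parameter-set identifier first,
    // then the dependency indication, then the slice address.
    void writeSliceHeader(BitWriter& bw, const Slice& s, bool dependentSliceEnabled) {
        bw.writeUE(s.parameterSetId);
        if (dependentSliceEnabled && !s.isFirstSliceInPicture) {
            bw.writeFlag(s.isDependentSlice);
        }
        if (!s.isFirstSliceInPicture) {
            bw.writeBits(s.sliceAddress, s.addressBits);
        }
        // remaining slice header syntax elements would follow here
    }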
[2]
2. Image encoding method, according to claim 1, characterized by the fact that the dependency indication is included in the bit stream when the indicator that enables the dependent slice indicates an inclusion of the dependent slice.
[3]
3. Image encoding method, according to any of claims 1 and 2, characterized by the fact that the indicator that enables the dependent slice is arranged at the beginning of the set of parameters.
[4]
4. Image encoding method according to any one of claims 1 to 3, characterized by the fact that each of the slices includes a plurality of macroblocks, and the encoding processing on the current slice is started after the encoding processing is performed on two of the macroblocks included in a slice immediately preceding the current slice.
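The start condition of claim 4 corresponds to a wavefront-style schedule and may be illustrated by the short C++ sketch below; SliceProgress and canStartCurrentSlice are hypothetical names, and an actual encoder would combine this check with per-row worker threads and initialization of the entropy coding contexts.

    #include <atomic>

    // Progress of the encoding in one slice (or LCU row), counted in macroblocks.
    struct SliceProgress {
        std::atomic<int> macroblocksDone{0};
    };

    // Start condition of claim 4: the current slice may begin only after two
    // macroblocks of the immediately preceding slice have been processed.
    bool canStartCurrentSlice(const SliceProgress& precedingSlice) {
        return precedingSlice.macroblocksDone.load() >= 2;
    }

Because each slice only waits for two macroblocks of the immediately preceding slice, the slices can be processed in a nearly parallel, wavefront-like manner.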
[5]
5. Image encoding method according to any one of claims 1 to 4, characterized by the fact that the dependency indication is not included in a slice header of a slice that is processed first in the picture among the slices.
[6]
6. Image decoding method for performing decoding processing by partitioning a picture into a plurality of slices, the image decoding method being characterized by the fact that it comprises extracting, from a coded bit stream, an indicator that enables the dependent slice, indicating whether or not the picture includes the dependent slice on which decoding processing is performed depending on a result of decoding processing on a slice other than a current slice, a slice address that indicates a starting position of the current slice, and a dependency indication that indicates whether or not the current slice is the dependent slice, wherein the indicator that enables the dependent slice is arranged in a set of parameters common to the slices, the slice address is arranged in a slice header of the current slice, and the dependency indication is arranged in the slice header, before the slice address and after a syntax element that identifies the set of parameters.
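As a decoder-side counterpart to the sketch given after claim 1, the following C++ sketch illustrates the extraction order recited in claim 6; BitReader, SliceHeader and parseSliceHeader are hypothetical names, and the reader stands in for actual entropy decoding. Reading the dependency indication immediately after the syntax element identifying the parameter set and before the slice address allows the dependency of a slice to be determined early in the parsing of the slice header.

    #include <cstddef>
    #include <cstdint>
    #include <vector>

    // Minimal stand-in bit reader; a real decoder would parse Exp-Golomb codes.
    struct BitReader {
        std::vector<uint32_t> symbols;
        size_t pos = 0;
        uint32_t readUE()        { return symbols[pos++]; }
        bool     readFlag()      { return symbols[pos++] != 0; }
        uint32_t readBits(int n) { (void)n; return symbols[pos++]; }
    };

    struct SliceHeader {
        uint32_t parameterSetId   = 0;
        bool     isDependentSlice = false;
        uint32_t sliceAddress     = 0;
    };

    // Extraction order as recited in claim 6, mirroring the encoder side.
    SliceHeader parseSliceHeader(BitReader& br, bool dependentSliceEnabled,
                                 bool firstSliceInPicture, int addressBits) {
        SliceHeader h;
        h.parameterSetId = br.readUE();                // identifies the parameter set
        if (dependentSliceEnabled && !firstSliceInPicture) {
            h.isDependentSlice = br.readFlag();        // dependency indication
        }
        if (!firstSliceInPicture) {
            h.sliceAddress = br.readBits(addressBits); // starting position of the slice
        }
        return h;
    }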
[7]
7. Image decoding method, according to claim 6, characterized by the fact that the dependency indication is extracted from the bit stream when the indicator that enables the dependent slice indicates an inclusion of the dependent slice.
[8]
8. Image decoding method, according to any of claims 6 and 7, characterized by the fact that the indicator that enables the dependent slice is arranged at the beginning of the set of parameters.
[9]
9. Image decoding method according to any of claims 6 to 8, characterized in that each of the slices includes a plurality of macroblocks, and the decoding processing on the current slice is started after the decoding processing is performed on two of the macroblocks included in a slice immediately preceding the current slice.
[10]
10. Image decoding method according to any one of claims 6 to 9, characterized by the fact that the dependency indication is not included in a slice header of a slice that is processed first in the picture among the slices.
[11]
11. Image encoding apparatus that performs encoding processing by partitioning a picture into a plurality of slices, the image encoding apparatus being characterized by the fact that it comprises an encoder that transmits a bit stream that includes: an indicator that enables the dependent slice, indicating whether or not the picture includes a dependent slice on which the encoding processing is performed depending on a result of the encoding processing on a slice other than a current slice; a slice address that indicates a starting position of the current slice; and a dependency indication that indicates whether or not the current slice is the dependent slice, wherein the indicator that enables the dependent slice is arranged in a set of parameters common to the slices, the slice address is arranged in a slice header of the current slice, and the dependency indication is arranged in the slice header, before the slice address and after a syntax element that identifies the set of parameters.
[12]
12. Image decoding apparatus that performs decoding processing by partitioning a picture into a plurality of slices, the image decoding apparatus being characterized by the fact that it comprises a decoder that extracts, from a coded bit stream, an indicator that enables the dependent slice, indicating whether or not the picture includes a dependent slice on which decoding processing is performed depending on a result of decoding processing on a slice other than a current slice, a slice address that indicates a starting position of the current slice, and a dependency indication that indicates whether or not the current slice is the dependent slice, wherein the indicator that enables the dependent slice is arranged in a set of parameters common to the slices, the slice address is arranged in a slice header of the current slice, and the dependency indication is arranged in the slice header, before the slice address and after a syntax element that identifies the set of parameters.
[13]
13. Image encoding and decoding apparatus characterized by the fact that it comprises:
the image encoding apparatus as defined in claim 11; and the image decoding apparatus as defined in claim 12.
Similar technologies:
Publication number | Publication date | Patent title
JP6558784B2|2019-08-14|Method, apparatus, and medium
CA2882792C|2021-08-03|Image decoding method, image coding method, image decoding apparatus, image coding apparatus, and image coding and decoding apparatus
CA2882731C|2018-10-09|Image decoding method, image coding method, image decoding apparatus, image coding apparatus, and image coding and decoding apparatus
WO2013144144A1|2013-10-03|Syntax and semantics for adaptive loop filter and sample adaptive offset
Patent family:
Publication number | Publication date
DK3122048T3|2018-03-12|
MY176984A|2020-08-31|
JP6758456B2|2020-09-23|
PH12015500365B1|2015-04-20|
TR201802584T4|2018-03-21|
PT3122048T|2018-04-20|
ES2780006T3|2020-08-21|
US9357234B2|2016-05-31|
ES2664361T3|2018-04-19|
JP6317015B2|2018-04-25|
EP3654649A1|2020-05-20|
CN104737541A|2015-06-24|
EP2903267B1|2017-04-05|
JP2017192144A|2017-10-19|
SG11201500846TA|2015-05-28|
EP3301923B1|2020-01-08|
US10616605B2|2020-04-07|
PH12017501838B1|2018-07-02|
CA2881221C|2021-04-27|
US20180084282A1|2018-03-22|
AU2013322008B2|2016-10-27|
RU2653236C2|2018-05-07|
CN108282655A|2018-07-13|
US9503755B2|2016-11-22|
US20140093180A1|2014-04-03|
HK1253286A1|2019-06-14|
JP2018125881A|2018-08-09|
RU2756093C2|2021-09-28|
TWI593274B|2017-07-21|
CN108282655B|2021-09-10|
EP3876536A1|2021-09-08|
JP6558784B2|2019-08-14|
US20200195975A1|2020-06-18|
AU2013322008A2|2015-03-05|
PH12015500365A1|2015-04-20|
PL2903267T3|2017-09-29|
EP2903267A4|2015-09-09|
EP3654649B1|2021-05-26|
EP3122048A1|2017-01-25|
MX339463B|2016-05-27|
MX2015002889A|2015-07-06|
RU2015103543A|2016-11-20|
EP3122048B1|2018-01-17|
US20150131738A1|2015-05-14|
JPWO2014050038A1|2016-08-22|
JP2019205183A|2019-11-28|
EP2903267A1|2015-08-05|
PH12017501838A1|2018-07-02|
KR20150063356A|2015-06-09|
KR102169058B1|2020-10-23|
RU2018111944A|2019-02-28|
ES2630359T3|2017-08-21|
US9872043B2|2018-01-16|
KR20200013098A|2020-02-05|
AU2013322008A1|2015-02-26|
CN104737541B|2018-04-10|
WO2014050038A1|2014-04-03|
US9014494B2|2015-04-21|
TW201429253A|2014-07-16|
KR102072832B1|2020-02-03|
US20170034534A1|2017-02-02|
PH12019501972A1|2021-02-08|
US20160241879A1|2016-08-18|
JP6172535B2|2017-08-02|
PL3122048T3|2018-07-31|
EP3301923A1|2018-04-04|
CA2881221A1|2014-04-03|
RU2018111944A3|2021-03-31|
Cited documents:
Publication number | Filing date | Publication date | Applicant | Patent title

US7903742B2|2002-07-15|2011-03-08|Thomson Licensing|Adaptive weighting of reference pictures in video decoding|
CN101796846B|2007-04-17|2013-03-13|诺基亚公司|Feedback based scalable video coding|
CN101389021B|2007-09-14|2010-12-22|华为技术有限公司|Video encoding/decoding method and apparatus|
US8938009B2|2007-10-12|2015-01-20|Qualcomm Incorporated|Layered encoded bitstream structure|
CA2701877A1|2007-10-15|2009-04-23|Nokia Corporation|Motion skip and single-loop encoding for multi-view video content|
US8126054B2|2008-01-09|2012-02-28|Motorola Mobility, Inc.|Method and apparatus for highly scalable intraframe video coding|
WO2010005691A1|2008-06-16|2010-01-14|Dolby Laboratories Licensing Corporation|Rate control model adaptation based on slice dependencies for video coding|
JP5341104B2|2008-12-08|2013-11-13|パナソニック株式会社|Image decoding apparatus and image decoding method|
US8705879B2|2009-04-01|2014-04-22|Microsoft Corporation|Image compression acceleration using multiple processors|
JP4957831B2|2009-08-18|2012-06-20|ソニー株式会社|REPRODUCTION DEVICE AND REPRODUCTION METHOD, RECORDING DEVICE AND RECORDING METHOD|
KR101504887B1|2009-10-23|2015-03-24|삼성전자 주식회사|Method and apparatus for video decoding by individual parsing or decoding in data unit level, and method and apparatus for video encoding for individual parsing or decoding in data unit level|
GB2488159B|2011-02-18|2017-08-16|Advanced Risc Mach Ltd|Parallel video decoding|
US9338465B2|2011-06-30|2016-05-10|Sharp Kabushiki Kaisha|Context initialization based on decoder picture buffer|
ES2789024T3|2012-04-12|2020-10-23|Velos Media Int Ltd|Extension data management|
PL2842313T3|2012-04-13|2017-06-30|Ge Video Compression, Llc|Scalable data stream and network entity|
US20130343465A1|2012-06-26|2013-12-26|Qualcomm Incorporated|Header parameter sets for video coding|
US20140086328A1|2012-09-25|2014-03-27|Qualcomm Incorporated|Scalable video coding in hevc|
US9491457B2|2012-09-28|2016-11-08|Qualcomm Incorporated|Signaling of regions of interest and gradual decoding refresh in video coding|
EP2904797B1|2012-10-01|2021-08-11|Nokia Technologies Oy|Method and apparatus for scalable video coding|BR112012014685A2|2009-12-18|2016-04-05|Sharp Kk|image filter, encoding device, decoding device and data structure.|
MX344952B|2012-07-09|2017-01-12|Vid Scale Inc|Codec architecture for multiple layer video coding.|
KR20150092105A|2012-12-06|2015-08-12|소니 주식회사|Decoding device, decoding method, and program|
ES2733223T3|2013-01-04|2019-11-28|Samsung Electronics Co Ltd|Entropy decoding procedure of cutting segments|
US9628792B2|2013-07-15|2017-04-18|Qualcomm Incorporated|Cross-layer parallel processing and offset delay parameters for video coding|
US10178397B2|2014-03-24|2019-01-08|Qualcomm Incorporated|Generic use of HEVC SEI messages for multi-layer codecs|
US10306239B2|2014-05-13|2019-05-28|Telefonaktiebolaget Lm Ericsson |Methods, source device, target device and analyser for managing video coding|
US10038915B2|2014-05-22|2018-07-31|Qualcomm Incorporated|Escape sample coding in palette-based video coding|
KR102276854B1|2014-07-31|2021-07-13|삼성전자주식회사|Method and apparatus for video encoding for using in-loof filter parameter prediction, method and apparatus for video decoding for using in-loof filter parameter prediction|
JP2017526228A|2014-08-07|2017-09-07|ソニック アイピー, インコーポレイテッド|System and method for protecting a base bitstream incorporating independently encoded tiles|
JP6365102B2|2014-08-14|2018-08-01|富士ゼロックス株式会社|Data processing apparatus and program|
US10123028B2|2014-09-17|2018-11-06|Mediatek Inc.|Syntax parsing apparatus with multiple syntax parsing circuits for processing multiple image regions within same frame or processing multiple frames and related syntax parsing method|
US10212445B2|2014-10-09|2019-02-19|Qualcomm Incorporated|Intra block copy prediction restrictions for parallel processing|
US10574993B2|2015-05-29|2020-02-25|Qualcomm Incorporated|Coding data using an enhanced context-adaptive binary arithmetic codingdesign|
US20170105010A1|2015-10-09|2017-04-13|Microsoft Technology Licensing, Llc|Receiver-side modifications for reduced video latency|
US10827186B2|2016-08-25|2020-11-03|Intel Corporation|Method and system of video coding with context decoding and reconstruction bypass|
US10805611B2|2016-10-18|2020-10-13|Mediatek Inc.|Method and apparatus of constrained sequence header|
CN106534137B|2016-11-18|2020-01-14|浙江宇视科技有限公司|Media stream transmission method and device|
US10469876B2|2016-12-22|2019-11-05|Mediatek Inc.|Non-local adaptive loop filter combining multiple denoising technologies and grouping image patches in parallel|
US10291936B2|2017-08-15|2019-05-14|Electronic Arts Inc.|Overcoming lost or corrupted slices in video streaming|
CN110401836A|2018-04-25|2019-11-01|杭州海康威视数字技术股份有限公司|A kind of image decoding, coding method, device and its equipment|
US11216923B2|2018-05-23|2022-01-04|Samsung Electronics Co., Ltd.|Apparatus and method for successive multi-frame image denoising|
CN112703743A|2018-09-14|2021-04-23|华为技术有限公司|Stripes and partitions in video coding|
KR20210104900A|2018-12-31|2021-08-25|후아웨이 테크놀러지 컴퍼니 리미티드|Video encoders, video decoders and corresponding methods|
EP3903492A4|2018-12-31|2022-02-23|Huawei Tech Co Ltd|Tile group signaling in video coding|
KR20210019530A|2019-06-24|2021-02-22|텔레폰악티에볼라겟엘엠에릭슨|Signaling parameter value information of a parameter set to reduce the amount of data included in the encoded video bitstream|
WO2020263132A1|2019-06-28|2020-12-30|Huawei Technologies Co., Ltd.|Method and apparatus for lossless still picture and video coding|
WO2020263133A1|2019-06-28|2020-12-30|Huawei Technologies Co., Ltd.|Method and apparatus for still picture and video coding|
WO2021045765A1|2019-09-05|2021-03-11|Huawei Technologies Co., Ltd.|Efficient adaptive loop filter parameter signaling in video coding|
US20210136419A1|2019-11-04|2021-05-06|Mediatek Inc.|Signaling High-Level Information In Video And Image Coding|
WO2021045656A2|2020-01-03|2021-03-11|Huawei Technologies Co., Ltd.|An encoder, a decoder and corresponding methods of flexible profile configuration|
WO2021170132A1|2020-02-28|2021-09-02|Huawei Technologies Co., Ltd.|An encoder, a decoder and corresponding methods simplifying signalling slice header syntax elements|
WO2021198491A1|2020-04-02|2021-10-07|Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.|File format schemes allowing efficient roi, stream access and parameter set handling|
CN112822514A|2020-12-30|2021-05-18|北京大学|Video stream packet transmission method, system, terminal and medium based on dependency relationship|
Legal status:
2017-07-11| B25A| Requested transfer of rights approved|Owner name: SUN PATENT TRUST (US) |
2018-02-06| B25A| Requested transfer of rights approved|Owner name: VELOS MEDIA INTERNATIONAL LIMITED (IE) |
2018-03-27| B15K| Others concerning applications: alteration of classification|Ipc: H04N 7/00 (2011.01) |
2018-11-21| B06F| Objections, documents and/or translations needed after an examination request according [chapter 6.6 patent gazette]|
2020-07-14| B15K| Others concerning applications: alteration of classification|Free format text: PREVIOUS CLASSIFICATION WAS: H04N 7/00 Ipc: H04N 19/174 (2014.01), H04N 19/30 (2014.01), H04N |
2020-07-21| B06U| Preliminary requirement: requests with searches performed by other patent offices: procedure suspended [chapter 6.21 patent gazette]|
2021-10-13| B350| Update of information on the portal [chapter 15.35 patent gazette]|
Priority:
Application number | Filing date | Patent title
US201261705846P| true| 2012-09-26|2012-09-26|
US201261711892P| true| 2012-10-10|2012-10-10|
PCT/JP2013/005541|WO2014050038A1|2012-09-26|2013-09-19|Image encoding method, image decoding method, image encoding device, image decoding device, and image encoding/decoding device|